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ABSTRACT 


This research represents an extensive study of the Augmented 
Lagrangian (ALAG) Penalty Function Algorithm for optimizing nonlinear 
mathematical models. The mathematical models of interest are 
deterministic in nature and finite dimensional optimization is assumed. 

A detailed review of penalty function techniques in general and the ALAG 
technique in particular is presented. Numerical experiments are conducted 
utilizing a number of nonlinear optimization problems to identify an 
efficient ALAG Penalty Function Technique for computer implementation. 
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I. INTRODUCTION 


The current advanced stage of development of the theoretical 
framework of *uncons trained optimization has served as a powerful force 
for unification of the subject which, until some years ago, consisted 
of a collection of disjointed algorithms. The evolution of these 
algorithms depended strongly on practical computation of solution to 
specific problems. The interplay of theory and algorithms has made 
it possible to transfer theoretical progress into improved algorithms. 

Powell (P5) has reviewed comprehensively modern algorithms and 
the effect of theoretical work on the design of practical algorithms 
for unconstrained optimization. Murray (Mil) has presented the main- 
stream of developments in numerical methods for unconstrained optimization. 
Much of the current research has been focused on understanding, comparing, 
improving and extending the available numerical methods instead of 
devising totally new algorithmic concepts. These refinements and modifi- 
cations are not expected to significantly improve the efficiency of existing 
algorithms (G2) . 

At present a robust collection of potent and sophisticated general 
purpose algorithms for unconstrained optimization is available as high- 
quality software (G2) . These algorithms have been tested and proven to 
be efficient and reliable for solving a variety of typical test problems 
and practical problems. Successful development of such algorithms for 
unconstrained optimization has been the springboard for the more recent 
success in the design of algorithms for constrained problems. 
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Availability of efficient numerical methods for solving 
unconstrained optimization problems has motivated the design of 
algorithms that convert a constrained problem to a sequence of 
unconstrained problems which have the property that successive 
solutions of the unconstrained problems converge to the solution of 
the constrained problem. This transformation approach has been 
systematically employed in the development of numerical algorithms 
for constrained optimization for more than a decade. In recent years 
a substantial body of theory has been established for these transfor- 
mation techniques and many computational algorithms have been 
proposed (B4) , (FI) , (L3) . 

To review briefly the transformation technique, consider the 

following inequality constrained nonlinear programming problem. Let 

( 2 ) 

f(X) and c. (X) i = l,2,....,m be real valued functions of class C 

'V i 

on a nonempty open set L in an n-dimensional Euclidean space E 
PI : Minimize f ()C) over all X e L 

Subject to c^ (X) > 0, i = l,2,....,m 
where feasible region F is a nonempty compact set. 

F = {X : c^ (X) >0 i = 1,2, .... ,m, X e L, L S E'^} 

Methods for solving PI via unconstrained minimization have been 
classified, described and analyzed in detail by Lootsma (L3) . Para- 
metric transformation methods solve PI by reducing the computational 
process to a sequence of successive unconstrained minimizations of a 
compound function defined in terms of the objective function f (X), 
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the constraint functions c. (X) i = and one or more 

controlling parameters. By gradually removing the effect of the 
constraints in the compound function by controlled changes in the value 
of one or more parameters a sequence of unconstrained problems is 
generated. Successive solutions of these unconstrained problems 
converge to a solution of the original constrained problem. The 
advantage of this approach lies in the fact that the constraints need 
not be dealt with separately and that efficient numerical methods for 
computing unconstrained extrema can be applied. 

During recent years the parametric transformation technique known 
as the Augmented Lagrangian (ALAG) Penalty Function Technique has 
gained recognition as one of the most effective type of methods for 
•solving constrained minimization problems. In the opinion of many 
researchers in this field, the ALAG penalty function technique is the 
best method available for solving problems with nonlinear constraints 
in the absence of special structure (B4) . The disadvantages of the method 
are negligible and the advantages are strong, especially the lack of 
numerical difficulties and the ease of using the unconstrained minimi- 
zation routine. The method has global convergence at an ultimately 
superlinear rate, the computational effort per minimization falls off 
rapidly, initial starting point need not be feasible and the function is 
defined for all values of the parameters (F7) . 

The ALAG penalty function technique is a balance between the 
classical penalty function technique and the Lagrangian primal-dual 
method which are both parametric transformation techniques. The design 
of this method was motivated by efforts to overcome the numerical in- 
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stability of the penalty function technique near the solution (P3) , 

(H2) and attempts to eliminate the "duality gap" in nonconvex 
programming (R6) . The classical penalty function technique and the 
Lagrangian pi»imal-dual method are briefly reviewed and the development 
of the ALAG penalty function technique by. the merger of the penalty 
idea with the primal-dual philosophy is traced in section 2. The 
ALAG penalty, function technique is described, reviewed and discussed in 
section 3. The results of numerical investigations are presented in 
section 4. The symbols, mathematical terms and related concepts used 
in this work are defined briefly in appendix A. The method of solving 
a nonlinear problem using the ALAG penalty function technique is 
illustrated with numerical examples in appendix B. 
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II. REVIEW OF RELATED MINIMIZATION TECHNIQUES 
2.1 Penalty Function Technique 

The peitalty methods have been extensively used in numerical optimiza- 
tion for more than a decade. The penalty function approach has been popular, 
as evidenced by applications to practical problems (D3) , because it is 
conceptually simple and easy to implement. It permits a transparent program 
structure as it is fully based on unconstrained minimization. These methods 
are applicable to a broad class of problems, even those involving noncpnvex 
constraints. The most attractive feature of these methods is the fact that 
they take advantage of the powerful unconstrained minimization methods that 
have been developed in recent years. 

The penalty function technique is a sequential parametric transformation 
method. It is an iterative algorithm that requires the solution of an 
unconstrained optimization problem at each iteration. In these methods 
the objective function f(X) is minimized using an unconstrained minimization 

'\j 

technique while maintaining implicit control over the constraint violations 

by penalizing the objective function at points which violate or tend to 

violate the constraints. The solution X* to the constrained minimization 

a. 

problem Pi is approached from outside the feasible region F and these 
methods are also referred to as exterior point methods . The penalty 
function technique has been popularized mainly through the work of Fiacco 
and McCormick (Fl) . Fiacco and McCormick (FI) developed the Sequential 
Unconstrained Minimization Techniques (SUMT) for nonlinear programming 
using penalty function and related concepts. A chronological survey of 
the development of the penalty methods and detailed discussion and analysis 
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of penalty and related methods are presented in reference (FI) . 

The penalty function method for PI consists of sequential minimizations 
of the form 

mxnimize P(X, a) , X e L S 

*\/ 

P(X, a) is the penalty function with control parameter o > 0. This function 

'h 

is designed to impose an increasing penalty on the objective function as 
constraint violation increases. The control parameter a is used effectively 
to increase the magnitude of penalty. 

The penalty function transformation may be represented as 

m 

P(X, a) = f(X) +0 E n.(c.(X)), a > 0 where [1] 

Oj i— 1 ^ ^ ^ 

n^(t) is defined as the loss function with the following properties. 

(i) n^(t) is continuous on -«> < t < " 

(ii) for inequality constraint c . (X) > 0 

n^(t) « as t ^ and n^(t) = 0 for t > 0 

(iii) for equality constraint c.(X) = 0 
n^(t) > 0 Vt, n^(t) = 0 for t = 0 and 
ri^(t) ->■ « as t ±“ 

Usually the loss function, ri,(t), is chosen such that when the objective 

( 2 ) 

function and the constraint functions are of class C , P(X, a) is twice 
differentiable. P(X, o) is defined on an open set L S E and P(X, o) -> “> 
as constraint violation increases. 

Several different loss functions have been proposed for use in the 
penalty function algorithm and these are discussed by Fiacco and McCormick 



7 


(FI) . The most commonly and widely used loss function is the quadratic 

loss function. For an inequality constraint c.(X) > 0, quadratic loss 

1 = 

2 

function is n.(c (X)) = [min (0, c (X))] . For an equality constraint 

2 

c,(X) = 0, tl^ quadratic loss function is n.(c,(X)) = (c.(X)) . 


1 1 -V 


1 


An elaborate treatment of the penalty function algorithm can be found 
in (FI) , (L5) and (Zl) for a general nonlinear problem. The basic algorithm 
may be represented as follows : 

(k) 

(i) Select an infinite sequence {a } which is monotonically 
increasing as k ->■ “>. Find X^^^ ^ F, where F is the feasible 

'V. 

region defined by the constraint functions. Set k = 0. 

(ii) Set k = k + 1. 

(iii) Minimize P(X, a^^^) to find X(a^^^) = X^^^ starting the 

Oi ^ 'Xj 

(k-1) 

minimization from X' . Return to (ii) if convergence is 

% 

not satisfied. 


Convergence tests in step (iii) are usually based on the magnitude of 
quantities such as (f(X^*^^) - f(X^^ ^^)) II - X^^ || where || X || 

'V 'V/ '\j 

is the Euclidean norm of the vector X. Other convergence criteria are 

discussed by Fiacco and McCormick’ (FI) . It is assumed that the function 

(k) 

f(X) is bounded below so that a solution X^ to the unconstrained minimization 

'V '\f 

(k) 

in step (iii) exists for each a . In step (i) the initial starting point 


.( 0 ) .. 


'Xj 


is outside the feasible region F and the trajectory corresponding to 
the sequence {X^.^^} generated by the algorithm lies outside F. Therefore 

'V 

penalty function methods are also known as exterior-point methods. Any 

(k) 

limit point of the sequence {X^ '} generated by the penalty method is a 

'V 

solution X* to the constrained minimization problem PI (H4) , (L5) , (Zl) . 

'Xj 
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The penalty function technique might be regarded as a "primal" approach 
to implicitly account for the constraints, although its connections with 
duality are known (FI), (L5) , (Zl) . The approximation of the constrained 
problem by the unconstrained penalty problem becomes more and more exact as 
the control parameter a ^ However considerable computational difficulties 
are experienced with the traditional penalty function algorithm as a 
These difficulties are delineated in detail in references (L3) , (L5) , (M5), 

(Rll) . The computational difficulties arise from P(X, a) forming an increasingly 

'\j 

steep-sided valley as the control parameter is increased to allow the 
unconstrained solutions to approach the constrained solution to Pi from 
outside the active constraints. In particular, the Hessian matrix of the 
penalty function [1] becomes extremely ill-conditioned as a increases. 

This leads to numerical instabilities during unconstrained minimizations 
of the penalty function and slow convergence of the algorithm. 

Attempts to overcome these computational difficulties have resulted 
in several modifications (FI), (F2) , (L3) to the penalty function technique. 
Hestenes (H2) and Powell (P3) , at about the same time, independently 
proposed modifications that resulted in a new method related to the penalty 
function technique. In this new method penalty terms are added to the 
Lagrangian associated with the original constrained problem. Hestenes (H2) 
termed this the "Multiplier Method". It has become known as the Augmented 
Lagrangian Penalty Function Technique in subsequent discussions. This 
method alleviated some of the computational difficulties associated with 
the traditional penalty function technique (F8) and achieved better 
convergence properties than the method of penalty functions (H4) . This 
method is reviewed briefly in Chapter 3. 
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2.2 Lagrangian Primal-Dual Method 

The Lagrangian primal-dual method transforms a constrained convex 
programming problem into a sequence of unconstrained minimizations of the 
classical Lagrangian associated with the constrained minimization problem. 

The constrained problem PI becomes a convex programming problem when the 
objective function f (X) is convex and the constraints c,(X) i = 1, 2, ..., m 

Oi 1 'V 

are concave. The concept of the primal-dual method was first implemented 
by Arrow, et al. (Al) in the differential gradient scheme for approaching 
the saddle-point of the Lagrangian L(X, X) associated with a convex program. 

'\j -v. 

The Lagrangian associated with the convex problem PI may be represented as 


L(X, X) = f(X) - 

'\i 'h r\j 


m 

I 

i=l 


X, c,(X), 


X e i. ^ E", 


X e E 

'V. + 


[ 2 ] 


where E_^ is the nonnegative orthant of m-dimensional Euclidean space E 

m 

and the vector X e E. is called a vector of multipliers. 

Suppose that a point X* satisfies the constraints of the convex program 

'\i 

PI and the problem functions are of class . If there exists a vector 

X* such that 
% 

X* > 0, X.* c.(X*) = 0 V i and VL(X*, X*) =0, [3] 

— 1 X % >\i iXi '\j 


then X* is a global solution to the convex program PI. The vector X* is 
said to be the vector of Lagrange multipliers associated with X*. If the 
gradients of the active constraints at X* are linearly independent, then X* 

'Vj 'V 

is a regular point of the feasible region F and there exists a vector of 
Lagrange multipliers X* satisfying [3] . The conditions in [3] are called 

'X/ 

the Kuhn-Tucker first-order necessary conditions for X* to be a solution to 

'x^ 
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PI and for the convex program PI, these are also sufficient conditions for 
X* to be a global solution. For a nondif f erentiable convex problem PI let 

XI ^ 

there exist an X* e E and a X* e E. such that the pair (X*, X*) is the 

■V '\j ~ '\j '\j 

saddle-point of the Lagrangian L(X, X) associated with the convex program 

* . '\j 'X, , 

PI, i.e., L(X*, X) < L(X*, X*) < L(X, X*). Then X* is the global solution 
to the convex program PI and X* is the vector of Lagrange multipliers 

'Xj 

associated with X*. 

'Xj 

The differential gradient scheme of Arrow, et al. (Al) for a convex 
program may be viewed as a small-step primal-dual method where estimates 
of X* and X* are modified at each iteration to exploit the saddle-point 

Oi Oi 

nature of L(X, X) near (X*, X*). This structure of the method is revealed 

'\j 'Xj 'Xj 'Xi 

by the system of difference equations formulated by Uzawa (Al) to represent 
the differential gradient method. Davis (Dl) represents the iterations in 
this method as 


x(k+l) ^ _ ct VLCX^’^^ X^^^) 

O. r\j 1 1 'V % ^ 


■Ck+1) 


(k) 


,-l 


.(k) ,(k) 


X'" "' = min [0, X' " - a- B * VXL(X" ' , X"*""] 

f\j t\j L L f\/\j f\, r\j 


where a and a are scalars representing step-size, VL(X, X) is the gradient 

X 2 Or 

of L(X, X) with respect to X, VXL(X, X) is the gradient of L(X, X) with 

'Xj 'Xi % rX/X, ■'Xi 'Xj 'Xj 'Xj 

respect to X and B ^ and B ^ are positive definite matrices of order n and 

X 2 

( 0 ) ( 0 ) ® 

m respectively. The algorithm may be started at any ' £ f and X £ E,. 

'Xj 'V 

As the constrained problem in the above method is convex, the 


Lagrangian L(X, X) is also convex with respect to X. The iterations on 

'Xj '\j 'Xj 

(k) 

X , therefore, are descent iterations on L(X, X) and update of multipliers 
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may be viewed as ascent iterations on L(X, X). The X^^^ update may 

'V ^ 'V 'Xi 

also be regarded as approximate solutions to the associated dual problem 

(k) 

at X , The dual associated with PI is 


m 

Dl*: Maximize IA(X) over all X e E 

'K, "v 

l/(X) = infimum L(X, X), X e L , 

'\j ' ^ ^ ^ ^ 

% 


The Lagrangian L(X, X) is minimized over X e L for a sequence of multiplier 
(k) 

vectors X and the algorithm is a primal-dual method. Methods that are 


a. 

similar in concept to this algorithm are described by Powell (PA) , Bertsekas 
(BA) , and Lasden (LI) . 

The algorithms based on Lagrangian primal-dual method are not susceptible 
to numerical instabilities such as those discussed in connection with the 
penalty method. Primal-dual methods are based on the viewpoint that the 
Lagrange multipliers X* are also fundamental unknowns associated with a 


constrained problem. This is due to the reason that Lagrange multipliers 
measure sensitivities and often have meaningful interpretations as prices 
associated with constraint resources (HA), (L5). Useful duality results 
for convex programs have been presented by Luenberger (L5) and Zangwill (Zl) . 
Various formulations of the duality theory for nonlinear convex programs 
using the classical Lagrangian have been reworked and extended by Geoff rion 
(Gl) so as to facilitate, more readily, computational and theoretical 
applications. Methods based on the classical Lagrangian for solving a 
constrained problem PI have been reviewed by Lootsma (L3) . 

The Lagrangian primal-dual method is known to have serious disadvantages 
(R3) , (R6) . The most restrictive one is that the constrained problem must 
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be convex In order for the dual problem to be well defined and X iterations 
to be meaningful. In general inf (PI) > sup (Dl) and the equality holds 
good only for the convex problem PI. For nonconvex problems only the 
inequality hcvlds in the above relationship and in such cases a duality gap 
is said to exist. For nonconvex problems Everett (E2) introduced a primal 
dual method called generalized Lagrange multiplier method. This and other 
associated methods are summarized by Lootsma (L3) . Even though Everett (E2) 
suggested some methods of handling the duality gap, the method has been 
found to be useful only for certain nonlinear problems with special structure. 
The method is of importance in the decompoisition of large-scale problems 
with separable functions. In such cases minimization of the Lagrangian can 
be carried out efficiently due to the special structure of the constrained 
problem (E2) , (LI) , (L5) . 

For a convex program, if X* is the optimal solution to the constrained 

'h 

problem with corresponding Lagrange multiplier vector X*, then X* is the 

r\j 

unconstrained minimizer of L(X, X*). However, if X* is a local solution to 

Oi a. , '\j 

a nonconvex program with corresponding Lagrange multiplier vector X*, then 

'Vi 

X* may not be the unconstrained local minimizer of L(X, X*) and L(X, X*) may 
even have negative second derivatives at X* in certain directions normal 
to the feasible manifold F (R3) . Since curvature at a point is determined 
by the second partial derivatives, attempts were made to make the Lagrangian 
associated with nonconvex programs a convex function by adding quadratic 
penalty terms to it. This concept was first suggested by Arrow and Solow (Al) 
in connection with the solution of a nonconvex equality constrained problem 
using the differential gradient method. Arrow and Solow augmented the 
classical Lagrangian with quadratic penalty terms and this elegant idea 
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made the new augmented Lagrangian locally convex. This idea was independently 
reconsidered in an entirely different algorithmic context for equality 
constrained problems by Hestenes (H2) , Powell (P3) and Haarhoff and Buys (Hi). 
The algorithms that resulted from these efforts belong to the Augmented 
Lagrangian Penalty Function Technique which is reviewed in Section 3. 
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III. AUGMENTED LAGRANGIAN PENALTY FUNCTION TECHNIQUE 
3.1 Introduction 

The ALA'G penalty function technique may be reviewed from two entirely 
different points of view. The first view-point is that the methods that 
belong to this technique modify the Lagrangian associated with a nonconvex 
or a weakly convex constrained problem to have a local convexity property. 

This is because the characterization of solution to a constrained problem 
in terms of a saddle-point of the Lagrangian depends heavily on convexity 
properties of the underlying problem. The local saddle-point property is 
obtained by the presence of a convexifying parameter in the Lagrangian 
which makes the associated Hessian positive definite for large enough, but 
•finite, values of the parameter. Following this idea of local convexif ication 
many different modifications of the classical Lagrangian have been proposed 
to close the duality gap in nonconvex programming (Al) , (A2) , (M2) . 

The second viewpoint is to consider the technique and the quadratic 
penalty function method within a common generalized penalty function frame- 
work. The approach here is to circumvent instabilities associated with the 
classical penalty function method by adding penalty terms to the Lagrangian 
function. The advantages of using a first-order penalty furnction have 
been listed by Lootsma (L3) and McCormick (M5) . Therefore methods that 
augment the Lagrangian with quadratic penalty terms are considered in detail. 
The development of the ALAG penalty function technique is traced from the 
second viewpoint. 



3,2 Review of the Technique for Equality Constrained Problem 


3.2.1 Equality Constrained Problem 

The equality constrained problem P2 may be represented as follows: 


P2: Minimize f (X) 

Subject to c.(X) =0 i=l,2, ...,m m<n 

( 2 ) 

f (X) and c . (X) i = 1, 2, ..., m are real-valued functions of class 

'X, ^0/ 

defined on a nonempty open set LSl.E^. The Lagrangian associated with P2 


m 

L(X, X) = f(X) - I X. c.(X), X e E“., 

'\j '\j 'Xj 1 X ^ 


The gradient and Hessian of this Lagrangian with respect to X are VL(X, X) 

0./ ^ 

2 

and V L(X, X) respectively. 

a, 'h 

Let X* be an optimal solution to P2 and the problem functions f (X) 

'V 'V 

( 2 ) 

and c.(X), i = 1, 2, . . . , m be of class C' ' in an open neighborhood of X* 
1 . “V 

The following are assumed to hold good at X*. 

(i) The point X* is a regular point of the feasible set 

'X,' 

F = {X: c. (X) = 0 i = 1, 2 X e L S e’^} 








Let N* = N(X*) be the nxm matrix [Vc , Vc , ..., Vc ]. 

'X; r\j J. 'X/ Z 'X/ in 


The regularity of the feasible set at X* is satisfied when 

'V 


N* is of full rank. 

(ii) There exists an unique Lagrange multiplier vector X* such 
that the following first-order necessary conditions for 
local optimality at X* are satisfied. 
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(iii) The second-order necessary conditions for local optimality 
at X* are that in addition to [5] 

'\j 

V^L(X*, X*) Y > 0 , V Y e e" [6] 

'Vi >\j <\j '\j ~ a. 

y = {Y: Y^ Vc. = 0 Vi} 

% % a. 1 

(iv) The second-order sufficient conditions for X* to be an isolated 
local minimum are that in addition to [5] 

Y^ V^L(X*, X*) Y > 0 V nonzero Y e V [7] 

'\j '\j 'V/ 

(v) Strict complementarity holds at X*, i.e. , X.* 0 Vi 

'v< r 

3.2.2 Powell - Hestenes Augmented Penalty Function 

Powell (P3) suggested the following penalty function to solve P2, 

1 ^ 2 
<|.(X, 0, s) = f(x) + ^ I o.(c (X) - e.)^ 

'Xt *\j '\j ^ 1=1 1 1 i 

= f(X) (c(X) - e)’’^s (c(X) - 0) [8] 

where 0 e C(X) is a vector of constraint functions c . (X) i = 1, 2, ..., m 

>\j 1 a. 

and S is a diagonal matrix of order ra with, diagonal elements > 0. Let 

a e E™ be a vector with a. as components. While the classical penalty 

r\j ' * 1 

function for P2 contains at most m control parameters, the above function 
depends on 2m parameters which are the components of 0 and o. The main 
difference between classical quadratic penalty function and [8] is the 
introduction of parameters 0. In [8] quadratic penalty terms have been 
added to the Lagrangian associated with P2. 

The augmented penalty function 4> is used in the algorithm as follows. 
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Algorithm Al: 


(i) Select 9 

Oi 


( 1 ) 


0., k = 0, = I and ^ F. 


( 0 ) 


(ii) k = k + 1 


(iii) Minimize <|>(X, 0 to find 

a. '\i 


starting the unconstrained 

% 'Kj . 


minimization from X 
(k) 

(k+1) _ Ak) 


(k-1) 


(iv) If CCX' ■') is converging sufficiently rapidly to zero then 

'Xj Oj 

g(k+l) ^ g(k) 

O. '\j 

^ (k+1) , (k) 


= 0"^^" - C(X^^^ 

'\j '\j <\i 


= S and return to (ii) 


otherwise 

.(k+1) 


= (1/10) 0 


(k) 


g(k+l) _ and return to (ii). 


•In step (ii) (j) is minimized with respect to X without constraints for fixed 

'V 

(k) (k) 

values of 0 and S and this is the inner iteration of the algorithm. 

'\j 

(k) (k) 

Step (iii) is the outer iteration in which 0 and S are changed to 

'V 

force constraint satisfaction and cause the sequence of solutions 

a. 

to converge to X* at a reasonably fast rate. 

The scheme for adjusting 0 parameters in the outer iteration is 

% 

(k) (k) (k) 

based on the observation that if X' is the minimizer of ({’(X, 0 , S^ '^) 

“V o> ''' 

(k) 

in the inner iteration, then X' is also a solution of the problem 

'U 


Minimize f (X) X e L ^ 


Subject to C(X) = C(X^*^^) 

'X, f\j ^ '\j 


In order to solve the equality constrained problem P2 it is sufficient to 
find 0^^^ and S^'^^ such that X^^^ = X(6^^\ solves the system of 

Oi '\j '\i 


nonlinear equations 
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c(x(e<«, 

'V Oi % 


0 . 


[9] 


The above system of equations is in terms of 2m parameters 0 and a 

i i 

(k) (k) 

i == 1, 2, . . . , m. One vector of parameters 0 ^ or.o^ ' may be fixed and 

[9] then is a* system of m equations in m remaining parameters. 

(k) 

If 0 is fixed, then [8] reduces to a basic penalty transformation. 

f\j 

Specifically when 0 parameters are set to zero, ij) becomes the classical 

a. 

quadratic penalty function. In such a case convergence of the sequence 
(k) 

{X } to X* is ensured by letting o i = 1 , 2 , . . . , m. This leads 

a. O) 1 

to numerical instabilities and slow convergence. Therefore in Powell's 
(k) (k) 

method S is held constant and 0^ is changed to force constraint 

satisfaction through iterative solution of [9]. Powell (P2) derived a 

(k) (k) 

simple correction for adjusting 0^ parameters when S is fixed by 

’\j 

■applying generalized Newton iteration to solve [9]. This correction is 
represented as 


0(k+l) ^ g(k) _ 

f\j '\j % 'V 


[ 10 ] 


(k) (k) (k) 

By definition X is the unconstrained minimizer of (i>(X, 0 , S ). 

'h '\j '\j 

Therefore 0 =0, i.e.. 


m 


Vf(X^^^^) + f a.(C.(X^^^) - 0 /^^) VC.(X^^^) = 0. 

'X, <vi i=l ^ 1" 'V ^ v ^ 


[ 11 ] 


Continuity of C(X) in the neighborhood of the local minimizer X* of P2 

'X, '\j 

(k) (k) 

implies that the matrix N(X' ') is of full rank for X^ ^ sufficiently close 


fk) (k) 

to X*. When X^ is in the neighborhood of X* and when X' 

O) 'Xj '\j 


X* the estimates 

'h 


. -s« (CCX<‘'>) - e‘W) 


[ 12 ] 
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exist and have as limit points the unique values X* = S* 0*, where 0* and S* 
are the parameters corresponding to X*. Hence the final value of the 

'V 

product S 0, in the limiting sense, is a constant and may considered to be 
independent of S when S is fixed and 0 is adiusted to let 0 0*. Due to 

A. rv 

this reason, when S is increased in step (iii) of algorithm A1 to improve 

(k.) (k) 

the rate of convergence of the sequence {max c.(X } to 0 and {X^ to 

i X 'V 

X*, 0 is decreased to keep S 0 a constant. 


Convergence of the algorithm is measured using the sequence (maxi c . (X^^^ ) | } 

. i 'Xj 

1 

Under the assumptions in 3.2.1 and when the Hessian matrix of (}i is positive 

definite at X*, Powell (P2) proved that the rate of . convergence is linear 

and the convergence ratio depends on l/o^ lot > o'. The threshold value 

o' is a large but finite positive real number. Therefore by choosing S to 

be large so that S is close to S’, where S' = o’ I, the algorithm can be 

made to have linear convergence at any arbitrary rate. Superlinear convergence 

is achieved when o^ In Powell’s algorithm the rate of convergence is 

taken to be satisfactory when the maximum residual, max | c . (X^^^ ) | , of the 

, X 'V 
X 

system of equations [9] is reduced by a factor of four on each iteration. 

The reason for preferring the slower rate of convergence implied by the use 

of factor four is that faster convergence tends to make the inner iterations 

more difficult (P2) . When the sequence {max | c . (X^^^ ) | } either fails to 

i ^ ^ 

converge or converges to zero at too slow a rate, S is increased by a 

factor of ten. The choice of factor ten to increase S is arbitrary. 

Numerical evidence indicates that the value of is seldom required to 

2 

be greater than 10 to ensure rapid convergence (Rll) . 

The Hessian matrix of ij) depends on both 0 and S. The change in this 
matrix is dominated by the increase in S (P2) . This is another reason 
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for using a factor of ten to increase S when the rate of convergence is 
slow and keeping S constant when rate of convergence is satisfactory. If 
S is chosen to be large in the initial iteration, instead of gradually 
increasing S, the Hessian of <}i becomes ill-conditioned and the unconstrained 
minimization of in the inner iteration becomes very difficult to perform. 
Further for a large S, an arbitrary starting point and arbitrary values 

'\j 

(k) 

of 0 parameters, the sequence {X^ } may not converge to X*. Therefore S 

Ck) (k) 

is increased so as to force X^ ' into a region in which sequence {X 

"V 'V 

locally converges to X*. Once this region is reached, S is kept constant 

'V; 

(k) 

and 6 parameters are adjusted so as to let X ^ X*. Further the gradual 

'\i 'h 

increase of S is designed to make ^ continuous and continuously differentiable 

with respect to X for all values of the parameters. In Powell's algorithm 

the minimizations in the inner iteration are not beset by computational 

difficulties associated with the basic penalty function transformations. 

The minimizations are well scaled and progressively less computational 

(k) 

effort is required as k increases and X X*. 

'V 'V> 

Hestenes (H2) , independently of Powell and at about the same time, 

proposed a similar method for solving P2 and he called it the method of 

multipliers. The method is based on the observation that if X* is a 

nonsingular minimum of P2, there exists ai multiplier vector X* and a constant 

a such that X* affords an unconstrained local minimum to the function 
0/ 


T(X, X*, S) = f(X) - C(X) + 1/2 (C(X))^ SC(X) [13] 

where S = ol. Conversely, if C(X*) = 0 and X* affords a minimum to [13], 

^ 'V 'Xi 

then X* affords a minimum to P2. In the method of multipliers a large 
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positive constant a is suitably chosen and is held fast. The augmented 
penalty function considered is 


T(X, X, S) = f(X) - C(X) + 1/2 (CCX))”^ SC(X) 

*\j '\j '\j r\j '\j 'yj ''''Vi 'V'Vi 


[ 14 ] 


where X e B and B is an arbitrary compact subset of E™. The function in 
'Vi 

[14] is sequentially minimized for successive estimates X of the unique 

'V . 

Lagrange multiplier vector X* at X*. 

'Vi 'Vi 

(k) (k) 

The unconstrained minimization of T(X, X , S) for an estimate X^ 

-Vi 'V. 'Vi 

(k) (k) 

of X* is the inner iteration. Let X^ = X(X^ , S) be an unconstrained 

'v 'V, i\i 

(k) (k) 

minimlzer of T(X, X' , S) . In the outer iteration the estimate X^ is 

'Vi 

(k) 

updated so as to cause X ' X*. Hestenes suggested the following formula 

“Vi iv, 

(k) 

for adjusting the multiplier vector X 


^<k«) ^ ^(k) _ ^(k) 




% 


[15] 


where S I, 0 < f = yo srid 0 < y ^ 1- The relation 

(k) 

[15] is derived from the observation that X^ is a local minimizer of 

'V 

(k) (k+1) 

T(X, X , S) and X^ is chosen so that first order necessary conditions 


'Vi 


'Vi 


.(k) 


are satisfied at X for P2. Hestenes (H2) did not analyze the convergence 
of the method, but subsequently (H4) established that the method converges 
linearly and superlinear convergence may be achieved when a In 

practical applications very fast linear convergence occurs for a large 
but finite value of a. Convergence is induced by not only a large value 
of a but also by multiplier iteration [15] (F8) . 


In Powell's method when S is fixed and 0 parameters are adjusted to 

’\j 

(k) 

let X X*, the unique Lagrange multiplier vector X* = S 0*, where 0* 

r\j rfj <V, 'V' 'Vi 
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corresponds to the vector of parameters at X*. This implies that a 
connection can be established between the augmented function <j> in [8] and 
T in [14] using the relationship 


A, iL lj2y m a 


[16] 


From [8], [14] and [16], 


1 m 2 

<^(X, 0, S) = T(X, X, S) + - Z X ./a . 

'h '\i ^ 1 1 


[17], 


r\j 'V, 


The difference between <}i and T is independent of X. If X(0, S) and X(X, S) 

'V ix, Oi V 

are unconstrained minimizers of (t> and T respectively for any S and if 0 and 
X are related as in [16], then X(0, S) = X(X, S) . Therefore the iterative 




'V 




a. 


•methods suggested by Powell and Hestenes for changing 0 and X parameters 
are the same. 

In view of the equivalence relationship [17] between (j) and T, the 
numerical algorithm A1 is discussed in terms of the augmented penalty 
function T. In the outer iteration adjustment of X parameters using [15] 
is considered, assuming that 0 and X are related by [16], The algorithm 

'V) 

A1 is discussed and analyzed using X parameters to emphasize the primal-dual 

'V 

(k) 

nature of the method which iterates with an approximation X^ to the 

(k) 

Lagrange multipliers X* in such a way as to make X X*. 

r\j f\j 

The algorithm A1 is now modified and denoted as the Powell-Hestenes 

augmented penalty function algorithm A2. The convergence of the algorithm 

is measured in terms of B = max c.(X) . 

. X «v 
X 
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Algorithm A2 : 

(i) Select = X^^\ k. = 0, arbitrary starting 




<\j 


point X and B = Br, where B > max c.(X ) . 

'v> U L) = . ' 1 ' 

1 

(ii) k = k + 1 

(iii) Minimize T(X, X^’^^ to find X^’^^ = X(X^^\ 

'Vi 'Vi 'h 'Xj 

(k-1) 

Starting the unconstrained minimization from X \ 

(iv) Find V = {i: |c (X^^^)| > B^^^/4}. 

If max|c (X^^^)| > B^*^\ set = B^^\ Go to (vii) . 

i i ^ 


,(k+l) 


(v) = max ( c . (X^^^ ) I . If b''*^''^'' < E stop. The E is a 

^ t - 

specified tolerance for B. 


, (k+1) 


(Vi) If < B^^)/4 or X^*^^ = 

“ 'V ^ 

set = x'") - s<>‘> C(X<«) 

'\j ■ '\j ^ 

„(k+l) (k) ,..x 

S = S , go to (xi) . 


(vii) Set =. X^^^ 

'V, 0/ 

(k+1) (k) ri 

a. ' = 10 0. ' Vi e V, 

1 1 


go to (ii). 


When second order sufficiency conditions hold good at X* for P2, there exists 
a o' > 0 such that for o. > o' Vi, the Hessian matrix of both <|)(X, 0*, S) 

1 ~ '\j % 

and T(X, X*, S) at X* is positive definite and X* is a strong local minimum 

'V 'V • 'V % 

of ^(X, 0*, S) and T(X, X*, S) (B2) , (B7), (F8) , (H2) . It should be noted 

'V/ 

that the local convexity of <j)(X, 0*, S) and T(X, X*, S) near X* is established 

a, 'V/ 'V 

without any assumptions about the convexity of problem P2. The aim of the 

(k) (k) 

algorithm A2 is to keep S constant and adjust X so as to cause X X*. 

a. a. V 
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Therefore in subsequent discussions it is assumed that a > o' Vi have been 

i = 

chosen and held fast so that <j> and T are locally convex. Due to this reason 
the explicit dependence of X on S is dropped and X(X, S) is represented as 
X(X). 

'V * . 

Haarhoff and Buys (HI) proposed a numerical algorithm very similar to 
the Powell-Hestenes method. They were motivated by the following observations 
about the traditional quadratic penalty function approach to solve P2. Let 
the quadratic penalty function for P2 be 


P(X, a) = f(X) + o 


m 

Z 

i=l 


(c.(X))‘ 

X -V 


a > 0. 


Let X(a) be an unconstrained minimizer of P(X, a) for a large value of 
control parameter a and X* be a local minimizer of P2. The gradient of 
P(X, a) is zero at X(a) but the gradient at X* is Vf(X*). Therefore, in the 
usual case when Vf(X*) is nonzero, X(o) and X* have to be different. Let X 

'V/ 'V/ 

be a solution to the under-determined system of equations C(X) =0. At X 

<v a. x. 

the gradient of P(X, a) is Vf(X) which is generally, not zero. Therefore X 

'V 'Vi 

and X(a) are different and for any finite value of a, X(o) is neither a 

'V , 

solution to P2 nor satisfies C(X) = 0. Usually X(o) tends to X* when a “ 

'V % x, 

(L5), (Zl) . From these observations Haarhoff and Buys added a linear 


combination of constraints to P(X, o) to obtain 

'Xi 


T(X, X, S) 


f(x) - x^ 


C(X) + 4 (C(X))'^ SC(X) 

'V 'V/ 4 <V Xi ' % 


S 


oI 


where X e E™ and o > 0. This function achieved their objective, i.e., 
balanced the gradient of f (X) in the vicinity of the minimum by a linear 

'V 

combination of gradients of constraint functions C(X). 
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The augmented penalty function proposed by Haarhoff and Buys is 
identical to the Powell-Hestenes augmented penalty function for P2. However 
the numerical algorithm of Haarhoff and Buys has some distinct features. 

They noted that the multiplier updates [15] are valid only when the function 

(k) (k) 

T(X, 1 , S) is minimized exactly for each X and that it is better to 

'V Oi 'h 

(k) 

terminate the inner iterations when a better value of T(X, X , S) is 

'V, a, 

obtained. They suggested that the multipliers . in the outer iteration be 
obtained from the first order necessary condition. 

Vf(X^^^) = N(X^^^) X, X e E®. [18] 

"V 'X, 

The condition [18] represents an over-determined system of n equations in m 

(k) 

parameters. Taking the scalar product of [18] with each VC.(X ), the 

-V 1 'X. 

following system of equations is obtained, 

Vf(X^^^) = N(X^^^) X, X e E®. [19] 

'X. 'x< 'X, “V 'X/ -v a. 

The expression in [19] represents a system of m equations in m parameters 

X that may easily be solved for X. This, in effect, is a least squares' 

solution to [18]. The vector of multipliers X is an estimate of the unique 

Lagrange multiplier vector X* at X* and X tends to X*. 

O/ <X. "X/ 'X- 

Haarhoff and Buys were more concerned with computational considerations 
than with convergence or duality aspects of the algorithm. They suggested 
that the problem functions be scaled so that the gradients are of the same 
magnitude and be on the order of ten. In this algorithm o^ i = 1, 2, . . . , m 
are kept constant and in the inner iteration the variable metric method of 
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fk') 

Davidon-Fletcher-Powell (DFP) is used to minimize T(X, X S). The 

2-1 

approximation to [V T] is updated using the DFP update formula (Mil). 

A restoration step is included in the inner iteration and in this step T 
is minimized without using derivatives in a direction that leads to the 

m 

satisfaction of linearized constraints. Other numerical aspects of the 

algorithm, such as the various stopping criteria for inner and outer 

2 -1 

iterations and updating the approximation to inverse Hessian [V T] are 
discussed in reference (HI). 

The elegant idea of local convexif ication of the Lagrangian was first 
introduced by Arrow and .Solow (Al) . They suggested addition of quadratic 
penalty terms to the classical Lagrangian to arrive at a modified Lagrangian 
that was locally convex. They were motivated by adaptation of the 
differential gradient scheme, developed by Arrow, et al. (Al) for 
approaching saddle points of convex programs, to nonconvex programs. Their 
differential gradient method is a small step-size algorithm while those 
of llestenes, Powell and Haarhoff and Buys are large step-size methods. 

In the above contributions to the augmented penalty function technique 
duality concepts were not employed. Primal-dual interpretation of the 
technique was analyzed by Buys (B7), Luenberger (L5), Rockafellar (R12) 
and Bertsekas (B2) , (B3) . A detailed review of the duality results may 
be found in reference (F8) . The duality results are summarized briefly 
in the next section. 

3.3 Review of the Technique for a Constrained Problem with Equalities 
and Inequalities 
3.3.1 Constrained Problem 

The problem P3 with equality and inequality constraints is represented 
as 
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n 


P3: Minimize f (X) , X e L*^E 


'V. 


Subject to c.(X) =0 i=l, 2, ...,k 

1 'b 

c . (X) >0 i = k+1, m, 

1 - 


0 < k < n. 


The real valued functions f (X) and c . (X) Vi are defined on a nonempty open 

'Xj ^ ^ 

set LS Let X* be a local optimal solution to P3. The problem functions 


.( 2 ) 


are of class C 
Lagrangian associated with P3 is 


on L and specifically in an open neighborhood of X*. The 

% 


L(X, A) = f (X) - C(X), X e E™, X e L 

'U'X/ O/'U'U'U'X' 


[ 20 ] 


The following conditions are assumed to hold good at X* (FI) , (M13) . 

'U 

(1) X* is a regular point of the feasible region 
0/ 

F = {X: C.(X) = 0, 1 < i < k and C . (X) > 0, k < i < m} 

^1% == l'V= = 

Let E = {i: 1 f ^ 5 

I = {i: C.(X*) =0, k < i < m}. The X* is a regular point 

1 % ~ 'V 

of F when {VC.(X*)} i c E S I is a linearly independent set, 

Tj 1 'X, 

(2) There exists an unique Lagrange multiplier vector X* e E™ 

a. 

such that the Kuhn-Tucker conditions are satisfied at (X*, X*) 

'\j 'X/ 

C. (X*) = 0 i e E 

1 'X, 

C.(X*) > 0 X.* > 0, X.* C.(X*) =0 i e [21] 

1<X,~ 1 ^'X/ 

VL(X*, X*) = 0 

a- 'X, 'x, 

These are first-order necessary conditions for local optimality 
at X* and (X*, X*) e satisfying [21] is termed a Kuhn- 

'X, 'X< 'X/ 

Tucker point. 



(3) Second-order necessary conditions for local optimality of X* 

a, 

are that in addition to [ 21 ] 

L(X*. X*) Y > 0 VY e StE" [22] 

'V '\j '\i % ~ fy, 

where 

V* = {Y: y”^ VC.(X*) = 0, i e E £r I* and 

'yj 'h X rXj 

y'^ VC. (X*) >0, i e I-I*}, 

'y, '\i X - 

I is the index set of active inequalities, I* is the index set 
of strongly active inequalities and I - I* is the index set of 
weakley active constraints. However the following weaker 
second-order necessary condition is usually assumed instead 
of [22] (FI), (Mil). 

Y^ V^L(X*, X*) Y > 0 V Y e V £. [23] 

V' = {Y: Y^ VC. (X*) = 0, i e E S'!}. 

a, 'v 'V 1 'V' 

(4) Strict complementarity holds at (X*, X*) when 

'X, 'V 

^ 0 for each 1 5 i < ra for which C.(X*) =0. [24] 

1 — — 1 a, 

A weaker form of [24] is 

X.* > 0 and C.(X*) = 0, i e I. [25] 

1 1 'V 

(5) Second-order sufficient conditions for X* to be an isolated 

'V 

local minimum are that in addition to [21] and [23] 


Y^ V^(X*, X*) Y > 0 V nonzero Ye/*. 


[26] 
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However the condition [26] is usually replaced by the verifiable 
condition (Mil) , 

,Y^ v\(X*, A*) Y > 0 V nonzero Y e V. [27] 

'\j '\j % r\j i\j 


3.3.2 Powell - Hestenes - Rockafellar Penalty Function 
The augmented Lagrangian penalty function for P3 is obtained by 
combining the Powell-Hestenes penalty function T and the Rockafellar penalty 
function T. The combined function may be represented as 


T (X, X, o) = f(X) - Z [X C (X) - 4 0. C.^(X)] + 

rrlK r\j 'Xj '\j ieE ^ ^ V Z X X i\j 


leE X 


[28] 


where 

X. X. 

(C.(X) - -i) = min [(C. (X) - , 0] 

1 a . X o . 

, X X 

j-m ^ 

o E fc , X e t . 

'V, * ' "V 


In [28], the factor represents a penalizing threshold for the ith 

inequality constraint. The multipliers X^ VI are unconstrained and this 

is an useful property of the augmented penalty function T . Further the 

function T possesses a number of strong properties not exhibited by the 
PHR 

classical Lagrangian L(X, X). The following properties of T^ make it 

r^J Xj rrlK 

ideal for use in. a primal-dual algorithm for solving P3. 

Let M(X) be the index set of the inequalities that contribute to 

a, 

the quadratic penalty term in T for an estimate, X, of the Lagrange 

PtiK 
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multiplier vector X*. 

'll 


•MU) = {i: 
% 


E, C.(X) < X./oi}. 

1 'V/ 1 


Equivalently,, 


M(X) = {i: X i E, X. - a. C . (X) > 0}. 

'V, 1 1 1 'V, 


At the local optimum (X*, X*) of P3, M(X*) is the index set, I*, of the 

'll '\j 'll 

strongly active inequalities. By , the strict complementarity assumption, 

I =1* and therefore M(X*) represents the active inequality constraints at 


the local optimum (X*, X*). Further the set EUM(X) represents the 

'll 'll 

inactive inequality constraints at the intermediate approximation (X, X) to 

'll 'v 


the solution (X*, X*). Let L = EUM(X). Then, 

h 'll 'X. 


L = 


= {i: i E, C, (X) > X./a.}, 

1 'U — 11 


Equivalently , 


L = U: i i E. - o- C.(X) > 0}. 

fV/l 1 1 -V = 


Using the above results the augmented penalty function T may be 

r HR 

represented as follows. 


T (X, X, a) = f(X) - I (X -70 C (X)) C (X) 
rHK 'll 'll h 'll xeE M(X) ^ ^ 1 1 'V 1 -V 


1 V X 2 , 

f X . /o . . 

2 . , 1 X 

xeL 


[ 29 ] 


[30] 


[31] 


[32] 


[ 33 ] 
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The representation of T in [33] clearly illustrates that it is obtained 

1: HK 

by combining the Powell-Hestenes penalty function T and the Rockafellar 
penalty function T. 

Mangasa^rian (M2) associated a wide class of Lagrangians with the 

nonconvex program P3. The unconstrained stationary points and local saddle- 

points of each Lagrangian were shown to be related to the Kuhn-Tucker points 

or local or global solutions of P3 (M2) , The Lagrangians considered by 

Mangasarian (M2) were twice differentiable globally. The augmented penalty 

function T belongs to the general class of Lagrangians investigated by 

Mangasarian (M2) . However the penalty function T is twice continuously 

JrHK 

differentiable in X except at points where X. - a. c . (X) =0, i e M(X). 

^ 1 1 1 -X, ^ 

By the strict complementarity condition, - o^ c^(X*) 0 for i e M(X*) , 

i.e., i e I, Therfore T is twice continuously differentiable in an open 

irnK 

neighborhood about (X*, X*). 

Mangasarian (M2) established the properties of the general class of 

Lagrangians for P3. As T ^ is a member of this class of Lagrangians, 

PHR 

the following properties hold good for T (M2). These properties of T 

PHK PHR 

also were established by Rockafellar (R6). For o e E?.j (X*, X*) is a 

Kuhn-Tucker point of P3 if and only if it is a stationary point of T . 

PrlR 

2 

For large but finite a., V T is positive definite (M2), (A3) and 

i PHR 


■^PHR <f’ ^ S "pHR «*• r- S "pHR r- 


[341 


V X e E™ X e A where A is some open neighborhood of X*. Conversely, if 
(X*, X*) is a saddle-point satisfying [3A], then X* is a solution of P3 

- 'X, - 

for X e A. 
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A duality theory in terms of extended Lagrangians was presented by 
Mangasarian (M2) . The' augmented dual problem may be represented as 


D3: Maximize g(A, o). 

. >\j 'Kj 


g(X, a) = inf T (X, X, a) 
Xel 


'\j 'll 


[35] 




The augmented dual function g(X, o) is concave in (X, a) and is strictly 

'Kj >\, a. "v 

nondecreasing in a. If the point (X*,. X*) satisfies the optimality conditions 

'Vi 'X, 'X, 

and if a is sufficiently large, then (X*, X*) is an isolated local maximum 

'Xi o> a. 

of D3. Conversely, if (X*, X*) is a global or local solution of D3, then 

'Xj 'Xj 

the optimality conditions for P3 are satisfied at (X*, X*) 

'X, 'X, 

Let X(X) = X(X, a) be an unconstrained minimizer of T„,,„ (X, X, a) for 

'Xj 'Xj 'Xj 'Xj 'Xj PHR 'Xj 'Xj 'Xj 

.X in an open neighborhood of X*. Then the dual function at this point may 


'Xj 


be expressed as 


g(X, 0) = T (X (X), X, o) = T (X). 

iXj 'Xj rnt\ "V 'Xj 'Xj rtlK. % 


[36] 


Useful duality results for multiplier iterations may be summarized as 
follows (F8) . 


9X. 


-c, (X(X)) 


i e E 


-min(C.(X (X)), X./o.) i c E 


[37] 


'Xj 'Xj 


1 1 


Let N be a matrix with Vc.(X), i e E'U’M(X) as columns and G be the Hessian 

'b 1 o- 

oC Then 


^ ^ ^PHR 


T -1 
-N G N 

0 

0 



i E E1/M(X) 

'V/ 


i e ETTM(X) 


[38] 
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Because X* = X(X*, a) for large a, the optimality conditions and the 

Oi O/ 

expressions [37] and [38] imply that T is concave in A for A close to 

rrlR r\j 

X* is a strong unconstrained maximizer of T , . 

^ ■ rrlK 

The above results indicate that the problem P3 may be solved by 

locating a saddle-point of The saddle-point theory and local duality 

results suggest a primal-dual algorithm for solving P3. The algorithm 

consists of inner and outer iterations and is similar to the algorithm A2. 

In the inner iteration, k, for fixed A^^^ and is 

% 'h FHR 'x, f\j 

minimized with respect to X starting the unconstrained minimization from 

% 

X^^ 1)^ initial starting point X^^^ need not be feasible and may be 

V 'X, 

(k) (k) (k) 

chosen arbitrarily. Let X = X(A , a ) be the unconstrained mlnimizer 

^ 'V 'X. 

(k) (k) (k) 

of T (X, A , a ). In the outer iteration o is increased so as to 

L HR ^ • 'V 

Ck') Ck'l fk") 

force (X' , A^ '') into a region about (X*, A*) and A^ is adjusted so 

'Xi ”X/ % '\j % 

as to ensure A -> X* and X^^^ X*. 

'\j r\j '\j % 

The duality relationships [37] and [38] suggest gradient and Newton 
steps for adjusting A in the outer iteration so as to maximize the dual 

, 'Xj 

functioni Mangasarian (M2) analyzed the method of multipliers with a 
gradient step for adjusting A in the outer iteration. 


^(k+1) ^ .^(k) pgj 

r\j 'X/ <XA« rnK 'X, 


He established the linear convergence of this algorithm with exact 
minimizations iri the inner iteration and a large but finite a. He also 
investigated the relation between 3 and the speed of convergence of the 
method. 

The convergence and duality analyses presented by Rockafellar (R6) 
also are valid for the primal-dual algorithm for P3 . . Rockafellar (R6) 
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established the convergence of the algorithm with inexact minimizations in 
the inner iteration. Pierre and Lowe (P2) comprehensively reviewed the 
technique for P3 and presented a numerical algorithm, test problems and 
computational results. In this implementation of the ALAG penalty function 
technique in a numerical algorithm, a simple gradient step for adjusting 
X was used in the outer iteration. 

'\j 

^(k+1) ^ ^(k) 

% "V 'V rnK. '\j 

The penalty parameters were monotonically increased in the outer iteration. 

The linear constraints were also included in the penalty term. A constraint 

with upper and lower bounds was treated as two separate constraints. This 

approach introduces two dual variables for such a constraint. 

Fletcher (F8) suggested second-order X iteration updates. He also 

devised a Newton-like iteration for updating X using estimates of G in [38]. 

a. 

In tlie numerical experiments, Fletcher (F8) used a quasi-Newton method for 

unconstrained minimization of T and built-up estimate of G. The change 

irHK 

in G was accounted for when a was changed. The computational results 
presented by Fletcher (F8) indicate that the Newton-like algorithm for 
updating X is more efficient that the gradient step for adjusting X. In 
these numerical experiments the penalty constants o^ were also adjusted 
(F8). Fletcher (F8) showed that this scheme for adjusting never fails 
to induce convergence of the algorithm and avoids increasing by an 
arbitrary factor of 10. 

Buys and Gonin (B9) performed sensitivity analysis with the aid of 
the ALAG penalty function T Similar sensitivity results were developed 
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by Armacost and Fiacco (A3) using augmented Lagrangian function In 

these analyses the following parametric mathematical programming problem 
was considered. 


(a) : 

Minimize 

f(x. 

a) , 

X 

£ E , 

a e 

% 



'V 



'V 


Subject to 

c.(X, 

a) 

= 

II 

•H 

O 

= 1, 


^ 'V 

'b 






c.(X, 

a) 

> 

0, i = 

= k+1 



1 







VII 

O 

< 

n 




In [41] a is the vector of sensitivity parameters. In these analyses, the 

'\j 

problem functions were assumed to be twice continuously differentiable 
in (X, a) in a neighborhood of (X*, a*) and for some a*, the conditions 

'V< % % 'll 

in 3.4.1 were assumed to hold at (X*, a*, X*). The X* is the vector of 

<v 'V 'V x 

Lagrange multipliers associated with a solution X* to P3 (a*) . 

'V 


[41] 
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IV. NUMERICAL RESULTS 


4.1. Introduction 

Numerical experiments have been conducted to identify the most efficient 
ALAG Penalty Function Technique for computer implementation. These numerical 
exercises, include testing individual unconstrained optimizers and constrained 
optimizers utilizing a wide range of inequality and equality constrained 
nonlinear optimization problems. Phase one of these numerical experiments 
involved testing a number of popular unconstrained optimization algorithms. 

The most effective of these algorithms were then incorporated into ALAG 
Penalty Function routines for the solution of constrained optimization 
problems. 

4.2 Unconstrained Optimizing Algorithms 

Two different classes of algorithms for solving the unconstrained 
optimization problems have been tested on several sample problems. The 
first class of algorithms tested were those that do not require derivative 
functions. These algorithms make use of finite difference approximations 
for derivatives or work solely with the given problem function in seeking 
an optimum. The second class of unconstrained optimizers require explicit 
first derivative functions. The unconstrained optimization techniques are 
identified in the following table and discussed in (L5) . 

These algorithms performance on a number of sample problems is 
described in Table II. Based on the results presented in Table II and 
computer programming considerations algorithms 4, 5 and 7 were incorporated 
into computerized ALAG Penalty Function routines and tested with a number 
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TABLE I 

UNCONSTRAINED OPTIMIZERS TESTED 


Derivative Free Optimizers 

1.. Hooke-Jeeves Pattern Search Algorithm 

2. Powell's Algorithm 

3. Stewarts Adaptation of the Davidon-Fletcher-Powell 

Algorithm 

4. Fletcher's Finite Difference Technique for a 

Complimentary Davidon-Fletcher-Powell Algorithm 


First Derivatives Required 

5. Complimentary Davidon-Fletcher-Powell Algorithm 

6. Davidon's Variance Algorithm 

7. Complimentary Davidon-Fletcher-Powell Algorithm 

(with no line searches) 



NUMBER OF FUNCTION EVALUATIONS FOR THE UNCONSTRAINED OPTIMIZERS 
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of inequality and equality constrained nonlinear optimization problems. 

4.3 ALAG Constrained Optimizing Algorithms 

The selected ALAG routines were tested on many of the example constrained 
problems presented in (B6) • Table III summarizes the computational results 
achieved for these example problems where the algorithrms tested were 

1. ALAG algorithm with unconstrained optimizer 5 (see Table I) . 

2. ALAG algorithm with unconstrained optimizer 7 (see Table I). 


3. 


ALAG algorithm with unconstrained optimizer 4 (see Table I) . 
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TABLE III 

COMPUTATIONAL RESULTS FOR NONLINEAR CONSTRAINED PROBLEMS 


• 

Problem (See Reference (B6)) 

Number of Function 
and Gradient 
Evaluations 

Number of 
Unconstrained 
Problems 

, ■ ■ 

1 

2 

Algorit 

3 

hm 

1 

2 

3 

12-1 

44 

33 

167 

3 

3 

5 

12-3 

19 

24 

96 

2 

3 

4 

12-5 

42 

62 

166 

3 

3 

5 

12-8 

57 

49 

99 

3 

4 

5 

12-10 

42 

31 

72 

2 

2 

4 

12-14 

27 

74 

121 

2 

3 

5 

12-15 

1 

21 

30 

118 

2 

3 

4 

1 

12-17 

80 

120 

* 

5 

9 

* 

12-18 

147 

193 

* 

7 

9 

* 

12-23 

32 

i 

45 

122 

3 

3 

5 

•12-25 

1 

172 

174 

326 

7 

6 

9 


*Did not converge to correct solution 



APPENDIX A 


MATHEMATICAL CONCEPTS AND PENALTY FUNCTION TECHNIQUES 

1. Introduction 

Symbols, mathematical terms and related concepts are defined and briefly 
reviewed in this section. The topics that are directly connected with this 
work are alone considered. The terms and definitions are those commonly used 
in standard books on Nonlinear Programming (H4) , (H5) , (L4) , (L5) , (Ml) . A 
detailed information about the following concepts may be found in the above 
references. 

2. Euclidean n-Dimensional Space 

In this work real-valued functions on a set L in an Euclidean space 

are considered. By an Euclidean space is meant a linear space whose point 

T 

are representable by n-tuples X = (x, , x„ , x ) . The nonnegative 

% 1 z n 

n ^ n 

orthant of E is denoted as and the positive orthant of E. is denoted 

n 

as E I [ . A point is represented as a column vector U6-ing capital letters . 
with underscore X, Y, or lower case letters with underscore a, b, 

i\j r\i '\j'\j 

or Greek letters with underscore a, X, .... The components of a vector are 

real numbers represented by lower case letters with subscript. The set of 

real numbers is denoted as E. The real numbers in E are represented by 

lower case letters a, b, and Greek letters a, B, without subscript 

or with subscript a^, a^, ^ 2 ’ •••• Superscript in parentheses ^ 

used to represent an element of a sequence of vectors or real numbers. 

Subscript -<J> also used to distinguish different vectors X , X , .... 

<\,1 

A linear space E'^ is a set of elements X, Y, called vectors, 

a. 'X, 

for wliich the operations of addition of vectors and multiplication of 
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vectors by scalars a, b, ... are defined and the Euclidean norm of a vector 
is defined as 


X|| . + + 


+ X 


n 


2 ^ 1/2 


Linearity implies that ifaeE,beE,xeE’^ and y e E'^, then ax + by e E*^ 

'V <\j >\, 

A subspace L of E^ is a subset of E^ such that L is a linear space with the 
same operations as those defined in E’^ and with the same scalar field. A 
subspace L of E^ is also called a linear manifold. 


3. Sets 

The set F of elements X in E*^ satisfying a property P(X) is represented 

'Vi ,'\j 

as 


F = {X: P(X)}. 

% % 

A member Y of the set F is denoted as y e F and if Y is not a member of F, 

then y ^ F. The union of two sets A and B in E*^ is the set of elements that 

'Vi 

belong to either A or B. 

A SrB = {X: X e A or X e B}. 

% 'V 

The intersection of two sets A and B is the set of elements that belong to 
both A and B. 

AS=B={X: XeA and X e B}. 

'\j ^ <v 

If every element of A is also a member of B, then A is a proper subset of 
B, i.e., AS^B. If. A B, then A may be a proper subset of B or may be B 
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itself. The complement of a set A is denoted as and it consists of 

elements not in A. If a e E and b e E, etc., [a, b] denotes the set of 

real numbers a < x < b. If x e (a, b] then a < x < b. 

A real-valued function f (X) defined on a subset F of E'^ is represented 

as f (X) : E. The minimization of f(X) over the set F is represented 

as 

Minimize f (X) 

X e F 

'Vi 

If F is the space E*^, then the minimization is unconstrained. Otherwise 
the minimization is constrained. 

4. Linearly Independent Set of Vectors 

A set of m vectors X- , X„ , ...» X is said to be a set of linearly 

Oil '\/4 <Y,m 

independent vectors if a relation, of the form 
a.X, + a-X- + . . . + a X =0 

holds only when the scalars a,, a_, ...» a are all zero. The vectors are 
linearly dependent if they are not linearly independent . A set of n linearly 
independent vectors is a basis for e”. The dimension of a space is the 
number of vectors in a basis for that space. Let a set of m linearly 
independent vectors in E*^ define a subspace B of E*^. The set of all 
vectors in E*^ which are orthogonal to B is a subspace called the orthogonal 
complement of B and is denoted by B^. Any vector X e E^ may be uniquely 

”V. 

represented as X = Y + Z where Y e B and Z e B^. 

0 ,/ ^ 

5. Characterization of Neighborhood of a Point, Sets and Sequences 
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5,1 Neighborhood of a Point 


The e-neighborhood of a point' X* in is the set of points X lying in 

''' ''' 

the open sphere or ball of radius e > 0 and X*. The e-neighborhood of X* = 

{X: II x-X* II* < e}. In general it is not necessary to restrict a neighborhood 

% 'h 

of a point to be an e-neighborhood. Therefore a neighborhood of a point X* 

'Xj 

is defined as any open set containing X*. 


5.2 Nature of a Point X With Respect to a Set F in 

A point X is an interior point of F if F contains an e-neighborhood of 
X^, A point X^ is an accumulation point or a limit point of F if every 
e-neighborhood of X contains a point X / X belonging to F. A limit point 

r\jO i\j i\jO 

of F need not be in F. A point X is an isolated point of F if X is in F 

%o 

but is not a limit point of F. A point X is a boundary point of F if every 

'vO 

e-neighborhood of X^ contains points in F and points not in F. A point X^ 
is an exterior point of F if it is interior to the complement of F. 


5.3 Characterization of a Set in Terms of the Points in it 


A set F, in E*^ is open if all of its points are interior points. 


Equivalently, F is open if given X e F and 3 and e > 0 3 || Y-X || < e 

'\i '\j '\j 

implies Y e F. It is closed if it contains its limit points. Equivalently, 
F is closed if X e F and X X implies X e F. The closure of any set F 

0.1 0^1 fx, 'h 

in E’^ is the smallest closed set containing F. The boundary of a set is 


that part of the closure that is not in the interior. A set F is bounded 


if there exists a positive number r such that || X || < r for every X e F. 

'Xj ~ Oi 

A closed and bounded set is said to be compact. A neighborhood of a set 


F is an open set V containing F. By an e-neighborhood of F is meant a set 


of points each of which lies in an e-neighborhood of some point X in F. 
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The e-neighborhood of F is the union of the e-neighborhoods of its points. 

If A ^ E is a bounded set of real numbers, then the smallest real 
number y such that x < y Vx e A is called the least upper bound or supremum 
of A and is denoted as 

m 

y = sup(x) or y = sup{x: x e A}. 

X e A 

Similarly, the greatest lower bound or infimum y of a set A is denoted as 


y. = inf(x) or y = inf{x: x e A}. 

X e A 


5.4 Characterization of a Sequence 

fk.) “ (k) 

A sequence of vectors is represented as {X^ ^ ^k.~0 ^ when the 

(k) 

index set is implicitly understood. The sequence {X } is said to converge 

% 

to the limit X* if || X^^^ - X* II 0 as k -> ». Equivalently, X* is the limit 

'\i <\j '\j 'V 

(k) 

point of the sequence {X } if for every e > 0 there is an integer p such 
(k) 

that X is in the e-neighborhood of X* whenever k > p. Each of the symbols 


"X 

Or 


(k) 


-> X*", "lim X 

<\j '\j 


= X*" and lim X^’^^ = X* 


'h 


k-x» 


(k) (k) 

signifies that X* is the limit of the sequence {X' . If X^ X* and 

% 'Xj '\i '\i . 

{Y^k)} ^ subsequence of {X^^^}, then X*. A sequence is a 

i\j r\j Oi 'V- . '\j 

Cauchy sequence if 


lim II 
k,s,-H» 


II . 0. 


A sequence {X^*^h in converges if and only if it is a Cauchy sequence. 

Of 
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(k) 

A sequence {X^ } is bounded if there is a finite positive number r such 


a, 


.(k) 


< r for every integer k. A point X* is an accumulation point 

— 'V 


that 11 X 

'V 

(k) 

or a cluster point of a sequence {X } if it is the limit of a subsequence 


A set r in fc is closed if and only if the limits of convergent sequences 
in F are in F. Every bounded sequence of points in possesses a 

'V 

(k) 

convergent subsequence. Let {r' '} be a bounded sequence of real numbers 

and = sup{r^^^; i > k}. Then converges to a real number q* 

called the limit superior of and I/* = lim (r^^^). 

k->«> 


5.4.1 Order of Convergence of a Sequence 
(k) 

Let {r^ } be a sequence of real numbers converging to r*. The order 

(k) 

of convergence of {r^ } is defined as the supremum of nonnegative numbers 

p satisfying 


0 < lim 
k-^ 


I (k+1) ,1 

r - r* 


.(k) 


< «>. 


- r*[p 


This definition of the order of convergence is a step-wise concept as it 

defines bounds on the progress made in moving from kth term to (k+l)th term. 

The order of convergence is determined only by the properties of the sequence 

when k It is a measure of the speed of convergence of the "tail" of 

(k) 

the sequence {r ’ }. A large value of p implies a high speed of convergence. 
If the sequence has pth order of convergence and if 


.(k+1) 


3 = lim 
k-^ 


- r* 


,(k) 


- r* 
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then asymptotically 



When p ■= 2 the sequence has second order convergence. 

(k) 

If the sequence {r' has an order of convergence equal to unity, then 
it is said to converge linearly to r*. The sequence converges to r* linearly 
with convergence ratio 3 if 


lim 

k-^ 



A linearly convergent sequence with convergence ratio 3 is said to have a 

(k) 

tail that converges at least as fast as the geometric sequence {d3^ } for 

some constant d. Therefore linear convergence is sometimes referred to as 
geometric convergence. The smaller the convergence ratio, the faster is the 
rate of convergence. When p = 1 and 3=0, the rate of convergence is said 
to be superlinear. The convergence of any order greater than unity is also 
superlinear. 

The average convergence rates may be used to place bounds on the 
average progress per step over a large number of steps. However in comparing 
convergence of different sequences, the step-wise convergence rates are 
usually used. When the sequences are well behaved and the limits involved 
in the definition of convergence rates exist the step-wise and average 
convergence rates coincide. Additional information on the convergence of 




8 


with respect to a function that converts the sequence of vectors into a 

sequence of numbers. If f (X) : -*■ E is defined on the convergence 

(k) 

of .{X i to X* can be defined in terms of the convergence of f(X) to f(X*). 


% 




(k) 

The function f (X) used in this way to measure the convergence of {X^ } is 

called the error function. 

In optimization theory, the objective function f(X) or the function 
2 

II X - X* II is chosen as the error function to analyze the convergence of 

(k) 

the sequence of intermediate solutions {X } to X*. The order of convergence 

'\j 

of a sequence is insensitive to the particular error function used and hence 
the particular error function used to measure convergence is not really 
very important (L5) . 

The order of convergence of a sequence is a local convergence property 
and is a measure of the ultimate speed of convergence. It is generally used 
to determine the relative advantage of one algorithm to another. The global 
convergence property is concerned with whether starting at an arbitrary 
point the sequence generated will converge to a limit point or a solution. 


6. Matrix Notation 

A matrix with ra rows and n columns is denoted as an mxn matrix. A 

diagonal matrix with n rows is denoted as a diagonal matrix of order n. 

A diagonal matrix with unity as diagonal elements is denoted as the identity 

matrix I. The double subscript notation is used to represent the elements 

of a matrix. A matrix H with elements h. , is represented as H = {h. .}. 

T 

The transpose of a matrix B is written as B . A square matrix is said to 
be nonsingular if its determinant is not zero. The inverse of a nonsingular 
square matrix G is denoted as G A matrix N whose columns are X , X , ..., X 

'X.i <X,z .\,1 
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is represented as N = [X , X , X ]. A vector X e is a matrix with 

'v,2 /\^n r\j 

n rows and one column. A row vector is represented as the transpose of a 
column vector. The determinant of a matrix H is denoted as |h|. 


7. Eigenvalues and Quadratic Forms 


Let H be a square matrix of order n. A scalar X and a nonzero vector 
X e satisfying HX = XX are said to be an eigenvalue and an eigenvector 

'h '\i r\j 

respectively of H. The number X is the eigenvalue of H corresponding to 


the eigenvector X. All the eigenvalues of H are obtained by solving the 

'h 

characteristic polynomial of degree n in X, |h - IX| =0. 

T 

If the square matrix H of order n is symmetric, i.e., H = H , then 


(i) The eigenvalues of H are real. 


(ii) Let X and X be distinct eigenvalues of H and X and X be the 

1 ^ 'vX /\jZ. 

T 

corresponding eigenvectors, then X X = 0. 

»vX f\,^ 

The matrix H is positive definite when 

T 

(a) The quadratic form X H X is positive definite, i.e., 

<V 'V 


X^ H X > 0 V nonzero X e E'^. 

% 'V ''' 


(b) All its eigenvalues are positive,- i.e., X^ > 0 Vi. 

(c) The determinants of the leading principal minors of H are positive 


The leading principal minors of H are represented as 


H. = .{h. , } (i,j = 1, 2, . . . , p) . 

P 


The matrix H is positive semidefinite when 

T 

(a) The quadratic form X H X is positive semidefinite, i.e., 

X^ H X > 0 V nonzero X e E*^ and 

T n 

X H X = 0 for some nonzero X e E . 

*\f ’yj 
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(b) The eigenvalues > 0 Vi and X^ = 0 for at least one but 
not all i. 

The leading principal minor test cannot be used to determine semidefiniteness 
of the matrix. H. When some of the determinants of the leading principal minors 
are zero, the. test will not provide information about the definiteness of H. 

A matrix H is indefinite when 

(a) The quadratic form H X is indefinite, i.e., X^ H X < 0 for 

some nonzero X e and X H X > 0 for other nonzero X e 

'V 

(b) The eigenvalues X^ < 0 for some i and X , > 0 for some j . 

(c) . Let (h^(, i=l, 2, ,..,nbe determinants of the leading principal 

minors of H. The matrix H is indefinite if |h^| ^ 0 Vi and 

|h I/Ih I <0 for some i and |h |/|h ,| > 0 for some i. 

X ' i-1 3 j~l 

8. Norm and Condition Number of a Matrix 


The norm of a square matrix H of order n, subordinate to the vector 

„ „ M „ „ M 

norm || X || , is defined as 1| H || = max - n ^-i i — . The norm || H 1| relative 


II X II ^ 0 


'V 


to the Euclidean norm 


IS 


H II = 


max 


X 

a. 


T T 
^ 0 X X 




Therefore the norm ||.H || relative to the Euclidean norm of a vector X in E 

a. 

T 

is the square root of the largest eigenvalue of H H. If H is a symmetric 
matrix, then || H || is the largest eigenvalue of H and || H ^ || is the 
reciprocal of the smallest eigenvalue of H. Let X and X be the largest 

Xr 'O 

and smallest eigenvalues of H. Then the condition number r of the matrix 


n 
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H is defined as r = X„/X,. The matrix H is said to be well-conditioned if 

Ji, o 

the value of r is close to 1. If the value of r is very large, the matrix 
H is said to be ill-conditioned. The ill-conditioning of H increases as 
the value of r increases . 


9. Functions 


A real valued function f (X) defined on a subset L of is represented 

'Vi 

as f (X) : E. A function f(X) is said to be continuous on a set L if it 

% 'X, . 

is continuous at each point X in L. It is continuous at a point X in L if 

^O 'VO 

f (X) f (X ) whenever X e L and X ->■ X . Equivalently, f (X) is continuous at 
X if given e > 0 there is a 6 > 0 such that || X - X || < d implies 

rXjO i\,0 

|f(X) - f (X ) j < e , A set of real-valued functions c . (X) , i=l,2, ...,m 
a, 0,0 1 o, 

may be regarded as a single vector function C(X): E^ E™. Such a vector 

a, a, 

function is said to be continuous on an open set L S: if each of its 


component functions is continuous on L. 

(k) (k) 

A real valued function f(X) is said to be of class C or f e C on 

0/ 

an open set L Sr E^ if it is continuous and possesses continuous partial 

(k+1) (k) 

derivatives of all orders < k. If f e C on L, it is of class C on 

L. The gradient of f e at X* is the column vector 

'X, 


Vf(X*) = [ 

'Xj 'Xi 




. 

9x -X* 
n a, 


( 2 ) 

If f e C , the- Hessian of f at X* is the square symmetric matrix of order 

'X, 

n denoted as V^f(X*) or F(X*) 

a. 


v^f(x*) = { 


9x. 3x.'^X* " ^^ij^X* 
^ J a, 
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If the vector-valued function C(X): ^ E™ is of class its gradient 

3C. 

at X* is the mxn matrix, VC(X) = { 7 — Vi, j = 1, 2, . . . , n, called 

0» Oj 'Xi oX, A” 

J 

Jacobian of C at X*. If a vector X e E and if the real-valued function 
X C: -> f is of class the gradient of X^C at any point X is 

''' 'V. 'V 

V[X^C] = X. 

'\i '\j '\j . f\j 


T ^2 

The Hessian of X C at any point X is equal to 1 X . V C (X) . 

'\i ^ 'V/ i— 1 ^ 1 'v 

The set of points satisfying the equation f (X) = c, where c e E and 


f: E'^ E, forms a level surface of f. If f is of the form f (X) = 


n 


'V> 


^ a^ + b, a. not all zero, then the level surfaces of f are (n-1) 
i=l ^ 

dimensional hyperplanes and Vf is the normal to the hyperplanes. In general, 

'Xj 

if f e and Vf 0 at X in L, then Vf(X ) is the normal at X to the 

f\j 'bO f\j f\jO 

level surface f(X) = f(X ). If f e d is a direction vector in E*^ 

'Vj 

and F is the Hessian of f, then the directional derivative of f at a point 

T 

X in the direction d is d Vf and the second derivative of f in that direction 

Oi % '\j 'V 

is d"^ F d. 


a, 


.( 2 ) 


Let f e' be defined on an open set L ^ and X e L. In an open 

<vl 

neighborhood of X , f may be represented using the following Taylor series 

'X.l 


f(X) = f(X^) + (X-X.)^ Vf(x ) + I (X-X V^f(X,)(X-X ) 

U-L '\j ^ I\, '\/ '\j'^ 


'Xj 




+ r(X X-X ) 

>V^ '\j 'V*' 


where r(X , X-X,) is the remainder term. The remainder term satisfies the 

•Xj 0»i- 

relation (H4) 

r(X AX) 

lim — ^ ^ = 0 where AX = X -. X, . 

AX-0 II AX II 2 -V ^ -vl 

'Xj r\, 
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Therefore the quadratic approximation to f (X) about X is the Taylor series 

a . '\^1 

f(X) = .f(X ) + AX"^ Vf(X ) + i AX’^ V^f(X.) AX, AX = X - X. . 

'\j' ^ '\j ■ '\ji <v> 'V 'll Ojl 


10. Implicit Function Theorem 

The implicit function theorem is concerned with the conditions under 
which a set of equations g.(X, X) = 0 i = 1 , 2 , . . . , n, X e X e E”'*, 


-n+M 


'\j >\j 


g.: E E Vi can be solved for X as a function of X, i.e., as X(X). Let 

1 'X, '\j '\j '\i 

g^ Vi be continuous and have continuous first and second order partial 
derivatives with respect to X on an open set B E*^^. Let g: ^ 5 *^ 

'Xj fV, 

be a vector-valued function with g. as elements. Let Vg be the nxn Jacobian 

1 i\Aj 

matrix of g with respect to X. 


Suppose that g.(X, X) = 0 i = 1, 2, ..., n and |Vgl 0 at a point 
1 ^ a. 'x/xj 

(X*, X*) in B. Then there exists a continuous function X(X) on a neighborhood 

V 'V 

A S, of X* and a constant e > 6 such that X(X*) = X*, g.(X(X), X) = 0 Vi, 

■V I\j 'Xj >Xj ^ >Xj % 'Xj 

X e A. Further g.(X, X) =0, |1 X - X(X) |1 < e, X e A only when X = X(X) . 

% lO/'V/ A.'V'v % o.'va. 

( 2 ) 

If the functions g.(X, X) Vi are of class C on B, then the function X(X) 

1 'Xj iXj 

( 2 ) 

is also of class C on A. 


11. Local and Global Minima of a Function on a Set 

Let f(X); E*^ E be defined on an open bounded set LSi^E^. A point 

X* e 1 is said to be a relative minimum point or a local minimum point of 

f over L if there is an e > 0 such that f(X*) < f(X) V X e 1 and || X - X* || < e 

'V ~ ^ ^ 

The point X* is said to be a strict local minimum point or strict relative 

'X, 

minimum point or an isolated local minimum point of f over if there exists 
an e > 0 such that 

f(X*) < f(X) V X e L, X # X* and || X - X* |j < e. 

'Xj 'Xj 'Xj >Xj 'Xt 'Xj 
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A point X* e L is said to be a global minimum point of f over L if 
f(X*) < f (X) V X e L. The point X* is said to be a strict global minimum 
point of f over L if 


f(X*) < f(X) V X e L, X X*. 






A point X* is a local (global) maximum point of f (X) over L if it is a local 


a. 




(global) minimum point of -f (X) . , A point that maximizes or minimizes f oh 1 

a. 

is called an extreme point of f on L. 

12. Infimum and Supremum of a Function on a Set 

Let f (X) : E be defined on an open bounded set L The infimum 

a. 

of f on L is the greatest lower bound of f on S. It is the largest number, 

-00 < (X < such that f (X) > a holds for all X e L. It is denoted as 

=■ = . 'l, = 'X, 

"inf f(X)" or "inf f (X) on L" or "inf f(X)". Equivalently, 

XeL 'V 'x, 

a = inf {f (X) : X c U if 


(i) a < f(X) V X e L 

- a. 


(k) 


(ii) there is a sequence {X^ •' } e L such that 


lim f(X^^^) = a 


A point X* in L minimizes f(X) on L if and only if f(X*) = inf f(X). When 










a minimizing point X* e L exists, f(X*) is the infimum as well as the 
minimum of f (X) on L. If f (X) : ^ E is a continuous function defined 


0 / 


0/ 


On a compact set F.^E , then there exists a point X* such that f(X*) = 
inf f (X) on L , 

'V 
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The supremum of f over L is the least upper bound of f on L. It is the 
smallest real number, < g < oo^ such that f (X) < 3 V X e L. It is denoted 

- - 'X, = % 

as "sup f(X)" or "sup f(X)" or "sup f(X) on L". Equivalently, 

XeL 

oi ■ 

sup{f(X): X el} = inf{-f(X): XeL}. 

A point X* maximizes f (X) on L if and only if f(X*) = sup f (X) . 

■V “Xj 'V . 

13. Convex Sets and Convex Functions 

13.1 Convex Sets 

A set Fj£ is said to be a convex set if for every X^ , X e F and 

0.1 0.2 

0 < a < 1, 

. aX, + (1-ct) X- e F. 

'vx 

Geometrically, a set is a convex set if the line segment joining any two 
points in the set lies in the interior of that set. If BX + (1-B) X e F 

%x f\jZ 

for every X , X e F and B e E, then the set F is said to be an affine set 

f\jZ 

or a linear variety. 

The closure of a convex set is convex. The intersection and union of 
any number of convex sets is convex. The null set is assumed to be convex. 
The convex set defined by every convex linear combination of a finite number 
of points in E'^ is a simplex in E^. The convex hull of a set S is the 
smallest convex set containing S. The closure of a convex hull of S is 
the closed convex hull of S. 

13.2 Convex and Concave Functions 
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A function f (X) : E -> E defined on a convex set L is said to be convex 


on L if for every X , X e L and 0 < a < 1, 

'\,i ~ — 


f(aX, + (1-a) XJ < af(X) + (1-ct) f(Xj 




2 ' = 




If for X X , 0 < a < 1, X , X E L 


f(aX, + (1-a) XJ < af(Xj + (1-a) f(Xj, 


^2 


a.-* 


then f (X) is said to be strictly convex on L. A function f (X) is said to be 

'Tj <V> 

(strictly) concave on L if -f(X) is (strictly) convex on L. A positive linear 

'h 

combination of convex functions is convex. 

If f (X) : E*^ -5- E defined on a convex set E^ E*^ is of class on L, 

‘X, 

then f (X) is convex on L if and only if 

a. 


f(Xj > f(Xj + Vf(Xj (X, - X ) 

f\jZ — t\, r\j± f\jL 


for all points X , X e L. 
If for all X , X e L, 

'V/l 012 


f(X ) > f(X ) + Vf(X ) (X - X ) 

OjZ /\yX ^ 'X/Z 


( 2 ) 

then f is strictly convex on L. If f(X) is of class C' ' on a convex set L, 

'V 

then f(X) is convex on L if and only if at each point X e L the Hessian 

Oi ni 

matrix F of f is positive semidefinite. If F is positive definite V X e L, 




then f is strictly convex on L. 


13.3 Convex Sets Defined by Convex and Concave Functions 
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Let f (X) : E be a convex function defined on a convex set L. The 

'X, 

set F = {X: f (X) < a, X e L} is a convex set for every a e E, If f (X) is 

'Xi 'Xj ~ <\j 'Xi . 

a concave function defined on a convex set L, then the set 

m 

F = {X: f(X) > a, X e U 

'X, rXj ~ 'Xj 

is a convex set for every a e E. 

If f(X) is linear or affine, then f(X) < a defines an open half space, 

'\, 'Xj 

f(X) < a defines a closed half space and f (X) = 0 defines an (n-1) dimensional 

0 > ~ 'Xj 

hyperplane. The intersection of a finite number of closed half spaces is a 
convex poly tope. A nonempty bounded convex poly tope is a convex polyhedron. 

A convex set may be defined by linear equalities. However nonlinear 
equalities cannot define a convex set. A detailed treatment of convex sets 
and convex functions may be found in references (H4) , (L5) , (Ml) , (Rl) , (Zl) , 
(Z2). 

14. Penalty and Barrier Function Methods 

Consider the inequality constrained problem PI. The feasible region F 
is defined as follows. 

F = {X: c.(X) > 0, 1 < i < m}. 

The interior of the feasible region F is defined as 
F^ = {X: c.(X) > 0, 1 < i < m}. 

X X f\j 

The exterior of the feasible region F is denoted as F. 
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14.1 Barrier Function Method 

The barrier function method is a transformation technique. The barrier 
function transformation for PI may be represented as 

m 

B(X.u) = f(X) + I P.(C.(X)), u > 0. 

'\j ' X 1 r\, / 


The function B(X,^) is defined so that a barrier is constructed at the 

boundary of the feasible region F. A solution X* to PI is approached from 

the set by modifying the barrier function using the control parameter jx. 

The set F^ is assumed to be nonempty and this means that any boundary point 

of F may be approached from a point in the set F. This also implies that 

the barrier function is not a suitable transformation for equality constraints. 

In the function B(X,U), the second term is the barrier term. For U > 0, 

this term is bounded and is defined continuously on the interval c.(X) > 0. 

1 

Further p^(t) », as t ^ 0_^. , The commonly used barrier functions are (FI), 

(Rll) 


(i) The inverse barrier function p.(c.(X)) = (c.(X)) 

1 1 'V 1 'V 

(ii) The logarithmic barrier function p.(c.(X) = -X,n(c.(X)). 

1 1 'V 1 'Vi 

The function B(X,U ) is defined on F and twice continuously differentiable 
•x, / I 

in F-r- Further B(X,U ) > 0 and B(X,u) “ as c . (X*) -> 0 for any i. 

I '\j I = <v / i 

Therefore a barrier is established at the boundary of the feasible region. 
This barrier prevents a search procedure for locating a solution X* to PI 

r\, 

from leaving the feasible region. As B(X,R) is defined on F and the method 

a. ' I 

operates in F^, the barrier function method is also called an interior-point 
method. If c.(X*) = 0, then as X -> X*, the growth of p.(c.(X)) is controlled 

X • •Xj 1 X 

or cancelled by decreasing pi. 
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The barrier function method may be summarized as follows. Select a 
(k) 

sequence } such that for each k. 


<;*« and lim/« - 0, 

k-»^ 


For each k, minimize B(X, ) to find = X( starting the 

!\j ^ '\j f . 

■unconstrained minimization from X^^ The initial starting point X^^^ 

'b 'V 

must be in F^. The stopping criteria for each unconstrained minimization 


(k-1) 


) I or 


x« - x<"-« 


may be based on f(X^ ) - f (X 

'b 'b 

(k) 

Let {X^ ' } be the sequence generated by the method. Then any limit 

'\j 

point of this sequence is optimal for PI (Zl) . The behavior of B(X,n) may 

'b ^ 

be interpreted in the following way (Rll) . Let c.(X*) = 0 for some i. As 

X 

x^^^ X*, c.CX^^h ^ C rx*) = 0, p.(c.(X^’"^)) ^ «> and B(X^*^\lO ». 

<b 'bi^ i xi/b b' 

(k) 

However if U is decreased, then p (c (X )) can be allowed to increase 
' 1 i <b 

(k) (k) 

without increasing B(X ,^) . The monotonically decreasing sequence 

is chosen in such a way that 
(k) (k) 

(i) B(X , ^ ) monotonically decreases. 

b 

(k) 

(ii) B(X, 11 ‘^) is twice continuously differentiable in F . 

b I 

(iii) c.(X^^^) ^ 0, X^^^ X*. and f(X^^^) f(X*) 

X ^ 






-(0) 


As the search for X* is started at X' ^ e F , the barrier at the boundary of 

'V/ ^ i 

(k) 

F restricts the search procedure and the sequence, {X^ of minimizing 

b 

fk) 

points of B(X, JLO ') to the interior of F. The method is therefore called 

b . . 

an interior-point method. 

The strengths and weaknesses of the method are discussed in detail 
in reference (Rll). The method facilitates the solution of PI using an 
unconstrained minimization technique and the constraints need not be 
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accounted for explicitly. The convergence of the method has been established 
(FI) when the problem functions are continuous and X* is at the boundary of 

'V 

F or in the closure of F^. . Fiacco and McCormick. (FI) established that there 

(k) 

exist a sequence ) and a corresponding sequence of minimizing points 

(k) 

generated by the algorithm such that X' X* as k ->■ <». Similar convergence 

properties and convergence of the other related sequences have been proved 

by Luenberger (L5) and Zangwill (Zl) . 

The method does not require very strong constraint qualifications and 

it converges to a local minimum of PI where the Kuhn-Tucker conditions may 

(k) 

or may not hold. By monitoring the convergence of the sequences (C(X^ )} 

(k) 

and {X structural information about the problem PI may be obtained. 

The most commonly sought structural information is the set of active 
(k) 


constraints at X 
% 


(k) (k) 

The vector X is an estimate of X of the 


'V 


Lagrange multiplier vector X* at X*. The method converges even when the 

(k) 

minimization of B(X, u. ) is inexact for each k (Rll) . 

The weaknesses of the barrier function method are of a computational 

nature and are most serious when the controlling parameter^ is small. The 

numerical difficulties associated with the algorithm arise due to the ill- 

(k) 

conditioning of the Hessian of B(X, u ). The condition number of the 

'V ' 

(k) (k) 

Hessian of B(X, ^ ) increases as decreases. This causes B(X, n ) to 

f 'X; ” 

have steep-sided valleys and makes the search for an unconstrained minimum 

(k) (k) 

of B(X,^ ) difficult. In the algorithm,^ is gradually decreased so 

(k) 

as to make B(X, y ) twice continuously differentiable and to reduce the 

(k) 

ill-conditioning, of the Hessian of B(X, U ). The feature that restricts 
the general application of the method is that it requires the initial point 



to be feasible and the search for is as difficult as the problem to 

be solved. Further the method cannot handle equality constraints. 

14.2 Penalty Function Method 

The penalty function transformation for PI may be represented as 


m 


P(X, a) = f(x) + a E n,(c.(X)), a > 0. 


The properties of the loss functions n.(c.(X)) and P(X, a) are discussed in 

1 1 'V 

detail in Chapter 2. Additional information may be found in references (FI), 
(L5) , (Zl) , (Z2) . The penalty function designed to impose an increasing 
penalty on the objective function as the search point X moves away from F 
and the constraint violation increases. The loss functions n^(t) are defined 
■for -00 < t < «> and therefore the penalty function is defined on E^. This 
implies that both equality and inequality constraints can be handled by the 
penalty function transformation technique. When X e F, the loss term is 

'\i 

zexo and when X ^ F penalty is imposed on F(X) depending on how far X is away 

'X, 'V 

from F. Therefore the algorithm may be started at any X^*^^ e and specially 
X^^^ e F or X^°^ e F. 

'\j 'X, 

The loss functions n . Vi are usually chosen so that P(X, a) is twice 

1 

differentiable. However the following loss functions also are used in 
some algorithms. 

(i) Zangwiil's loss function for inequalities c . (X) > 0 

n. (c.(X)) = -min (0, c.(X)) 

1 1 'V 1 O" 

(ii) Absolute value loss function for equalities c.(X) = 0 

1 
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The basic penalty function algorithm is described in Chapter 2. 

(k) 

The use of monotonically increasing control parameter a in the 

(k) 

algorithm may be interpreted as follows. When ' is in F, the increase 

<“> 1, 

o' ' Z n^(c^(X)). Due to this increase in the penalty weight associated 


in o'' increases the penalty weight associated with the loss term 




with the loss term, in the subsequent unconstrained minimization of 

(k+1) (k) 

P(X, '^) the loss term is reduced and hence c.(X' 0, permitting 

a. ■ 1 -v 


X 


(k) 


X*. The structure of P(X, a) also implies that for large a, the 


minimum of P(X, a) will be in a region where a I n.(c.(X)) is small. The 

'h ^ 1 1 'V 

gradual increase in a is designed to make P(X, o) continuously differentiable 

'Vi 

and reduce the ill-conditioning of the Hessian of P(X, a). 

(k) 

The convergence of X to X* and the existence of a corresponding 

(k) 

monotonically increasing sequence {o } have been established by Fiacco 
and McCormick (Fl) , Luenberger (L5) , Zangwill (Zl) . The condition number 


of the Hessian of P increases as a increases. The penalty function P(X, a) 


forms increasingly steep-sided valley as a increases and this leads to 

numerical instabilities in the unconstrained minimization of P(X, a). Due 

a. 

to this reason, it is not possible to solve PI in one step via P(X, a) by 

'Xj 

choosing a large a. The gradual increase in a makes the successive 
unconstrained minimization problems easily to solve. In the penalty 
function method the solution X* is approached from outside F and therefore 
the method also is known as the exterior-point method. Lootsma (L3) has 
comprehensively reviewed and classified the loss functions and barrier 
functions. Duality analysis of the methods is developed in references 


(Fl), (L5) and (Zl) . 
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14.3 Mixed Interior Point - Exterior Point Method 
Fiacco and McCormick (FI) proposed and developed a mixed interior 
point - exterior point method for solving P3. The equality constraints 
are handled by the penalty function method and inequalities are taken into 
account using the barrier function method. The methods that solve a 
constrained problem by sequential unconstrained minimizations were termed 
Sequential Unconstrained Minimization Techniques (SUMT) by Fiacco and . 
McCormick (FI). The most popular form of SUMT uses a quadratic loss function 
to handle equalities and a logarithmic barrier function for inequalities. 

P(X, a) = f (X) = - Z in c. (X) + o Z (c (X))^ o > 0. 

^ ° i|E ^ ±zE ^ 

In the above function E is the index set of equality constraints. The 

properties of the mixed function are the same those reviewed above for 

(k) 

penalty and barrier transformations. The sequence X^ converges to X* 

'V 'V. 

when o ->■ <». Additional information about the properties, convergence and 
computational considerations of the mixed methods may be found in reference 
(L3), (FI). 

15. Duality Theory and Duality Gap 
15 . 1 The Primal Problem 

Let the primal problem be defined as follows. 

P; Minimize f (X) , X £ L 

i\, r\, 

subject to c . (X) >0 i = 1, 2, ..., m 
X “ 

f (X) : E, c , (X) : E*^ E Vi. 
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The problem functions are defined on the nonempty open convex set L. The 
problem P is assumed to have at least one feasible solution and the set 


{X: 


X £ L, c, (X) > a, } 

^ 'll 1 


in is compact and nonnull for every choice of e E^. These assumptions 

imply that a finite optimal value of P is attained in the feasible region 
F (R13) . Equivalently, < min (P) < The optimal value of P, in general, 
is inf (P) . Equivalently, the optimal value of P is the inf f (X) subject to 

h 

X £ L and c . (X) > 0. However, if X* is a minimizer of P in F, then min (P) = 

■X, 1 

inf (P) . The conditions imposed on P imply the existence of a solution X* to 

'll 

P. Therefore in subsequent discussions the optimal value of P is denoted as 


min (P) . 


The classical Lagrangian, L(X, X) , associated with P is defined as 

'll '\j 


follows (R13) 


L(X, X) : e"'^ ^ E 


f'fix) - C(X), X e E“ 


L(X, X) 




'll 'll 'll 'll 'll 


otherwise 


Since f(X) > L(X, X), sup L(X, X) = f(X) when X is feasible. The optimal 

~ 'll 'll 'll 'll h 'll 

value of P also may be represented as 


min (P) = inf (P) = inf 

XeL 

'll 


sup 

XeE" 

'll 


L(X, X) 

% 'll 


A vector X* is a Kuhn-Tucker vector for P if 
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inf (P) = inf L(X, X*) . 

'h 'h 


The optimality in P may be characterized by the general saddle point condition, 


X* E E“ 

'\j 


Min L(X, X*) 
XeL ^ 


15 . 2 The Dual Problem and Duality Gap 


The dual problem is defined as 


D: Maximize v(X) , X e E”* 

'V' 'V 


v(X) = inf L(X, X) 
XeL 

'Xj 


The optimal value of the dual is 


sup 

(D) = 

sup 

Xe 

inf 

XeL 

'Xj 

L(X, X). 

Since, min (P) 

= inf 

sup 

L(X, 

X) , min (P) > sup (D) or in general. 


XeL 

XeE 




inf (P) > sup (D) . If inf (P) > sup (D) , a duality gap is said to exist 
between the primal problem P and the dual problem D. If there exists a 
at which the maximum in D is attained, then sup (D) = max (D) . If X* 
solves D and min (P) = max (D) , then X* is a Kuhn-Tucker vector of P. 

'Xj 

15 . 3 Global Optimality and Primal-Dual Method 

The necessary condition for optimality may be expressed as follows. 


= L(X*, X*) = Max L(X*, X) . 

'Xj 'Xj 'Xj 'Xj 

XeE"" 


- 25 - 


■?>- 
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If X* is a global minimum of P and min (P) = max (D) , then the above saddle 
point condition holds. The sufficient condition may be reformulated as 
follows. If X* satisfies the above saddle point condition, then it is a 

global min of P and min (P) - max (D) . Further the vector X* in the saddle 

• . 

point relation is a global maximizer of D. This vector X* is a Kuhn-Tucker 

'X, 

vector for P. 

The saddle point condition is always sufficient condition for optimality. 
However it is a necessary condition that is required to establish the duality 
relation min (P) = max (D) . This duality relation is equivalent to the 
existence of a Kuhn-Tucker vector X* of P. The primal-dual methods exploit 

'V. 

this duality relation to solve the associated nonlinear problem. In the 
ideal case, the dual function v(X) may be maximized to get X* and then 

'\i '\j 

L(X, X*) may be minimized to get X*. This method of solving P is possible 

only for some simple problems. The numerical algorithms based on the duality 

(k) (k) 

relationship generate a maximizing sequence {X^ for D and for each X , 

% % 

(k) (k) 

generate X^ as a solution to min L(X, X ). The sequences are generated 
(k) (k) 

so that X X* and X^ ' ->■ X*. The saddle-point condition may be used to 

design primal-dual numerical algorithms for solving P only if the duality 
relationship min (P) = max (D) holds. The satisfaction of this duality 
relationship depends on the nature of problem functions and the fom of 
the Lagrangian function L(X, X) associated with P. 

15.4 Convex Duality 

If P is a convex, program then the compactness assumption is fulfilled 
when the set 


{X: X e L, C(X) > 0} 

'\j '\j 
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is compact and nonnull. The duality theory for convex programs has been 
reviewed in detail by Geoffrion (Gl) and Rockafellar (R12) , (R13) . For a 
convex program a Kuhn-Tucker vector A* ususally exists and the saddle-point 

•\t 

condition is always sufficient for the optimality of P at X*. Rockafellar 
• -v 

(R13) established that for a convex program P, min (P) = sup (D) (R13) . 

The point (X*, A*) is a saddle-point of L(X, A) on Lx E™, if and only if 

'Vi 'Vi 'V 

X* solves P and X* solves D. If A* solves D, then for X* to solve P it is 

^Vi '\j • 'Vi 

necessary and sufficient that (R13) 

(i) X* minimizes L(X, A*) on X e L 

'Vi '\j '\j 'Xi 

(ii) c.(X*) >0, A* > 0 for c.(X*) =0. 

i • i i % 


Geoffrion (Gl) and Lasden (LI) presented the computational applications 
of the duality theory for convex programs. Several other possible "duals" 
of P have been proposed using the Lagrangian function L(X, A) = f (X) - 

'Vi 'Vi 

T tq 

A C(X), A e E , . The following dual formulations are reviewed and 
compared by Geoffrion (Gl) . 

(i) Geoffrion dual G 

G: Maximize { inf (f(X) - A^ C(X))} 

„ Y o ( ''' -V 'V 'V 

A e E 'v, 

'V/ + 


(ii) Wolfe dual W 

W: Maximize f (X) - A^ C(X) 

r\j 'V 'V 'V 

A > 0 


X e L St 

'\j 

Vf(X) - E A_, Vc_, = 

'Xi ^ ^ i -V 1 


0 
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(iii) Stoer, Mangasarian and Ponstein dual 

SMP: Maximize f(X) - C(X) 

X > 0 

Oi = 


Subject to 

X minimize f(X) - X^ C(X) over L. 


15.5 Duality in Nonconvex Programs 


The dual formulation D of P is based on L(X, X) and has inherent limitations 

<V, 'h 

(R13) . The implicit feasible set in D is 


m 

{X: X E E , v(X) > 

Oi '\j Oi 


and it is difficult to determine a representation of this set. This implies 


that it is difficult to determine whether the inf L(X, X) over X e 1 is 

'X# 'V o.# 

finite and attained.' Further even if X* minimizes L(X, X*) and X* solves D, 

,i\j 'h \i 'V< 

X* may not solve P unless there is only one solution to P. The dual formulation 


D is meaningful only in the convex case, since only in this case it is possible 


to establish the relation min (P) = sup (D) (R13) . 


Rockafellar (R12) , (R13) and Mangasarian (M2) showed that by associating 


a different Lagrangian with P, the duality gap in nonconvex programs may be 


eliminated. A wide variety of Lagrangians may be associated with P and each 


choice corresponds to a different dual problem. Even though great flexibility 


is afforded by the theory in the choice of the Lagrangian for P, not all of 


these are of practical value in computation. The Rockafellar ' s augmented 


Lagrangian f(X, X, o) is a member of a wide class of Lagrangians and has 
'U 

proved to be useful to developing primal-dual numerical algorithms for 


solving P. The duality theory in terms of 'J'(X, X, a) for nonconvex programs 

'Xf *X/ 'Xi 
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is reviewed in Chapter 3. Detailed analysis of the duality theory based on 

"fCX, X, o) may be found in (R6) , (R12) and (R13) . The duality theory based 
a< % n/ 

on .'l'(X, X, o) for convex programs was investigated by Rockafellar (R4) , (R5) 
% % 

15.6 Partial Duality 

It is not necessary to include the, Lagrange multipliers of all the 
constraints of a problem in the definition of the dual function (Gl) , (L5). 
The duality can be defined with respect to any subset of the constraints. 

If a constraint is used to define the Lagrangian associated with P, it has 
a dual variable of its own. If a constraint is assigned to define the set 
L, it will not possess a dual variable. Consider the convex problem P with 
the constraints partitioned so that the dual is defined with respect to the 
constraints belonging to the index set J. Let p be the number of indices in 
J. Then the partial dual of P in terms of L(X, X) and with respect to the 
set J may be represented as 

PD: maximize v(X), X e 

v(.X) = inf L(X, X) X e L, X e 

<\, '\f 

c . (X) >0 i i J. 

1 a. “ 

The choice of assignment of a constraint depends on the structure of the 

problem, or the nature of the theoretical analysis or the ease of evaluating 

v(X) . 

a. 
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APPENDIX B 

COMPUTERIZED ALAG ALGORITHM AND APPLICATION 

1. Introduction 

• 

An ALAG penalty function algorithm to solve the equality and 
inequality constrained problem P3 (see Chapter 3) is presented. An 
equality constrained problem and an inequality constrained problem are 
solved using this numerical algorithm. The algorithm and the examples 
supplement the review of the ALAG penalty function technique reviewed 
within this report. This numerical algorithm was investigated by 
Fletcher (F8) . This algorithm incorporates the parameter iterations 
that have been proven to be efficient (F8) . A Quasi-Newton method that 
utilizes a complimentary Davidson-Fletcher-Powell update [F4] for. solving 
unconstrained problems is used in the inner iterations, 

2. ALAG Penalty Function Algorithm for Equality and Inequality 
Constrained Problem. 

The equality and inequality constrained problem P3 is defined in 
section 3.4.1. To simplify the presentation of the numerical algorithm 
in the next section, following notations are used. 

E : The index set of equalities 

E = {i:l£i£k} 

Sc. : The scale factor for ith constraint 

1 

(k) 

WW. : The scaled constraint violation for ith constraint in 

1 

iteration k 

^ ^‘^i I E and < 0^^^^ 
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WW 


\ ■ fr n^ 

min(C^ , 0) 


Sc. 

X 


1 ^ E and C. > 0 
^ 1 = i 


AKK 


(k). 


AK 


(k) 


The largest scaled constraint violation in iteration k 

r 


AKK 


(k) 


i inT 

= max i WW . V 

i I ^ 3 


(k) 

Initial value of AKK in iteration k 


AKMIN : The relative error tolerance required in the constraint 

('k') 

residuals c.. When AKK'' < AKMIN the algorithm is 

X 

terminated. This is the stopping criterion for the 
outer iteration. 


EPS. 

X 


c . 

X 


(k) 


The tolerance in x. for unconstrained minimization 

X 

The current value or residual of ith constraint in 
iteration k. 




The index set of constraints that contribute to the ALAG 


(k) 

penalty function. M(^ ) 


-b- 


i e E or 


i ^ E and c.^^^ < 9 . 

^ XX 


(k) 

p : The number of indices in M(X ). 

'V 

(k) (k) 

(AX^)p^'^ : The Powell-Hestenes correction for in kth iteration 

(AA.)^^^: The Newton correction for A . in kth iteration. 

X N X 


Algorithm A3 
(i) Select 


the initial starting point X 


( 0 ) 


the initial estimate of parameter vector 9 
the initial penalty constants ^ i 


( 1 ) 
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(ii) 


k = k + 1 


Ck') (k') 

(iii) Minimize 0 (X, 0'' , o' ) to find 

^(k) „ ,^(k) (k), . , . j 

X ~ ^ ® ^ > starting the unconstrained 


minimization from X 


(k-1) 


Use Broyden's Quasi-Newton method for unconstrained 

minimization of 4> (X, 

'X-’ n, ’ ^ ' 

<J> (X, = f(X) + 1/2 E [C. - 

''x.’ ^ ’a, ^ X 1 i 

ieE 


+ 4 S (C. - 0 /'^^) 

’ 1 11- 


if E 


<^1 - » i ). - 


0, C. - 9. > 0 

1 1 = 

(C. - 9.), C. - 9. < 0 
1111 


During the unconstrained minimization of $, an estimate of the Hessian 
of $ is built-up using the first order information about f (X), c. (X) 

'V 1 

(k) (k) 

and the change in X. The estimate of the Hessian of $ at (X ,9 , 

(k) (k) 

a ) is represented as G(o ), 


(iv) Estimate 

. T 1 • 1 • • -V (k) (k) ' (k) 

the Lagrange multiplier estimates 9^ 


(v) 


the constraint residuals c 


(k) 


the scaled constraint violations WW. 

1 


(k) 


(k) 


the largest scaled constraint violation AKK 
fk') 

If AKK' ^ < AKMIN, stop, 
fk") fk") 

If AKK > AK' go to (viii) . Otherwise go to (v) . 

Estimate (AA.)^^^ 

1 PH 




(k) ^ (k) . ^ 
o, C. leE or 
1 1 


iiE A. / 0, C. < 0, A. 
^11 1 


C > 0 
1 .1 = 




(k) 


A. i i E, A. / 0, C. <0, 
1 ^ 11 


, (k) _ , (k) ^ (k) ^ ^ 
1 11 
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the constraint tolerance AKMIN 


the tolerance EPS. on variable x. 

1 1 


the constraint scale factors Sc. 

X 


the Initial upper bound on constraint violation AK 
k = 0 

If k = 1, go to (vi) . 


( 1 ) 


(vi) 


If 


„ (k) (k-1) 

A . “ X . 

1 1 


< EPS^, stop. 


Otherwise go to (vi) . 

Find R=|^i : icEori^E and 


A. ^0 and c. < oT 

X X -> 


Let p be the number of indices in R and 

(k) _ (k) (k) ■ (k) p 

Y - (Y^ , Y^ , Yp ) e E 

Estimate = (AA.)^J^\ i e R. 

X X N 

(k) 

The Y^ , i e R are determined by solving the following 


subproblem. 


Min Q(Y^^^^) 


Y.(^>>-A.(^>, isR 

X = X 


q(y/^^) = • I C. ^ Y^*^^ (N^ G ^ N) Y^^^ 

i^R X X 2 -V 1 -I 

where G is the estimate of the Hessian of $ and the 


columns of N are the gradients of c^, i e R. 
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(vli) icR. 

1 XI 


(k) 


= 4 


(4Xi)<|;> - (4X^)‘« 


fAy 

(AAi)pH 


If Z.(k) > 1, and 

1 1 XX XX 


X e R 


If < 1, and d/’^^ = 0, i e R. 

X = ’ i X X 


(k+1) 


(k). 


x(k)„T 


GCa""" = G(o^ ') + N D" 'N 


- a,-. a a a 

D . = dxag (d^ , d^ , d^ ) . 


^(k) . ^^(k) 


go to (ix) . 

(vixi) Find D = ^ i : WW^ 


<« > Ax'^or 


X i J 


set . lO o/'^’ and X . - X . <« 

X XXX 


_, , . (k) . j (k) _ (k) . ^ j 

The change xn a . xs d . = 9 a . , x e D , and 

X X X 


d^^^^ = 0, i ^ D. Let be the diagonal matrix 


V 


(k) (k) 

with d. as elements. The estimate G(o ) of the 

X 'V 

Hessian of $• is ' adjusted to account for the change in o 


(k) 


as follows. 


= G(o‘W) + stk) p(k) ^(k)I 
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(k) 

The columns of N are the gradients of constraints whose 
indices. are in D 


If 


X,<« - 
1 1 


< EPS^, stop. 


Otherwise go to (ix) . 


(Ix) . X V 

1 1 i X 


go to (ii) . 


3. Numerical Examples 


3.1. Example 1: Equality constrained Problem 

Minimize : f (X) = (x - 1)^ + (x - x + (x„ 

''j 1 1 z / 


X3>' 


+ (X3 - x^) + (x^ 


X5) 


O “X 

Subject to c, (X) = X, + x„ + x» - 2 - 3 ^ = 0 

± ^ i Z j 




c^ (X) = x^x^ -2=0 


Starting point X^^^ = (2, 2, 2, 2, 2) 


Solution point X = (1.1911, 1.3626, 1.4728, 1.635, 1.679) 

A —2 

Optimal objective function value f = 7.8776 X 10 


The relative error tolerance in constraint residuals AKMIN = 


The error tolerance in variables EPS. = 0.00001 V i 

X 


0.0008 
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Outer Iteration 1 


X° = (2, 2, 2, 2, 2) 


9 ^^^ = ( 0 , 0 , 0 ) . 


SC = (7.75736, 1.0, 


.( 1 ) 


o' ' = (0.03323, 2.0 


= 10 X 10^^ 


2 . 0 ) 

, 0.5) 


Inner iteration 


X^^^ = (1.15955, 1.28716, 1.38550, 1.A6505, 1.70426) 
= (-.76667, 0.00417, -0.023827) 

= (0.09883, 0.00417, 0.01191) 

AKK^*^ = 0.09883 
Updating of parameters 


3 Active Constraints i = 1,2,3 

= (0.08186, - 0.06302, -.12599), = 0 

Increased to 0.29414 
02^^^ increased to 52.44928 



increased to 23.15088 
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Outer Iteration 2 

= (1.15955, 1.28716, 1.38550, 1.46505, 1.70426) 
9^^^ = (.27830,* -.00120, -.00544) 

= (0.29414, 52.44928, 23.15088) 

= 0.09883 
Inner iteration 

= (1.19807, 1.37601, 1.48774, 1.66416, 1.66479) 
= (0.14175, -.00162, -0.00546) 

'Vi 

= (0.01827, 0.00162, 0.00273) 

= 0.01827 

Updating of parameters 

= (0.08186, -0.06302, -0.1259,9) 

AA*'^^ = (-0.04160, 0.06244, 0.10922) 

'Xf 

3 active constraints i = 1,2,3. 

( 2 ) 

increased to 55.921 
Outer Iteration 3 

= (1.19807, 1.37601, 1.48774, 1.66416, 1.66479) 

'Vi 
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0^^^ = (.13687, -0.00001, -0.00072) 
o^^\ = (0-.29414, 55.921, 23.1509) 

= 0 . 0182 ^ 

Inner iteration 

= (1.1914, 1.36313, 1.47324, 1.63544, 1.67807) 

= (0.00452, -0.00031, -0.00074) 

'Xj 

(3) (3) 

WW^ ^ = (0.00058, 0.00031, 0.00037), AKK^ ^ = 0.00058 
% 

f (X^^^ ) = 0.07895. 

This is the optimal solution for specified stopping criterion 
AKMIN = 0.0008. 

3.2 Example 2: Inequality constrained problem 

Minimize f (X) = 2 - (x^x^x^x^x^) 

Subject to c^ (X) = x^ > 0 i = 1,2,...., 5 

*^i+5 " i - x^ > 0 i = 1,2, ,5 

Starting point X^^ (2, 2, 2, 2, 2) 

* 

Solution point X = (1,2, 3, 4, 5) 

ft 

Optimal objective function value f =1.0. 

The relative error tolerance in constraint residuals AKMIN = 0.0008 

The error tolerance in variables EPS. = 0.0001 

1 
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Outer Iteration 1 


X" = (2, 2, 2, 2, 2) 


= 0 
'I. 


Sc. = 1.0 V i 
1 


0^^^ = 3.46667 V i 


= 10 X 10^^ 


Inner Iteration 


= (1.35159, 2.21458, 3.15082, 4.11547, 5.0933) 

= (0,0,0,0,0,0,35159, .21458, .15082, .11547, .0933) 


= .35159 


Updating of parameters 


5 Active constraints i = 6,7,8,9,10. 


= (0,0, 0,0,0, .99039, .4847412, .3084211, .2188432, 


increased to 4.83063 


( 1 ) 


increased to 5.6865 


( 1 ) 


increased to 6.2857 


( 1 ) 


10 


.1978085) 


increased to 5.3862 
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Queer Iteration 2 

= (1.35159, 2.21458, 3.15082, 4.11547, 5.0933) 

= (0,0, 0,0,0, .28569, 0.10035, 0.05424, 0.03482, 0.03673) 

• 

( 2 ) 

^ = (3:46667, 3.46667, 3.46667, 3.46667, 3.46667, 3.46667, 

4.8306, 5.68664, 6.28569, 5.3862) 

= 0.35159 


Inner iteration 

= (1.00438, 2.004, 3.0049, 4.0054, 5.00085) 

( 2 ) 

WW^ ^ = (0,0, 0,0,0, 0.00438, 0.00395, 0.00486, 0.005399, 0.000854) 

= 0.00539 

Updating of parameters 

= (0,0, 0,0,0, 0.99039, 0.48474, 0.308421, 0.21884, 0.197808) 

'Xf 

AA^^^ = (0,0, 0,0,0, 0.01006, 0.01533, 0.025028, 0.031898, 0.00273) 

5 active constraints i = 6,7,8,9,10 



increased to 4.68405 


increased to 8.76983 


Outer Iteration 3 

= Cl. 0043, 2.0039, 3.0049, 4.0054, 5.0009) 

% 



9 ^^^ = (0,0, 0,0,0, 0.2136, 0.1035, 0.05864, 0.03989, 0.02287) 

= (3.46667,. 3.46667, 3.46667, 3.46667, 3.46667, 

4.68405, 4.8306, 5.6866, 6.2857, 8.7698) 

= 0.00540 

Inner iteration 

= (1,2, 3, 4,5) 

= (0,0, 0,0,0, 0.00011, 0.000031, 0.000031, 0.00013, 0.000067) 

'Xf 

f (X^^^) = 1.00018, = 0.00013 

This is the optimal solution for specified stopping criterion 
AKMIN = 0.0008. 
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APPENDIX C 

COMPUTER PROGRAM DOCUMENTATION 

This particular section of the report contains the pertinent 
documentation for the computer programs designed and implemented in 
conjunction with this research grant. Three different computer programs 
were developed all based upon the Augmented Lagrangian Penalty Function 
technique for Nonlinear Programming. These programs differ from each 
other primarily as a function of the type of unconstrained optimizer 
used. These programs are entitled ALAGl through ALAG3. ALAGl and ALAG2 
require closed form gradient equations for the functions to be optimized. 
Whereas ALAG3 does not require gradient information be supplied by the . 
'user . 


TABLE I. Unconstrained Optimizers for ALAG 
Computer Programs 


Computer Program 


ALAG 1 


ALAG 2 


Unconstrained Optimizer 


Fletcher algorithm using a quasi- 
Newton complimentary Davidon-Fletcher- 
Powell update formula (P4). 

Variable metric method without line 
searches as proposed and analyzed by 
Powell CP5) 


ALAG 3 


Same method as ALAGl except derivatives 
are estimated by differences 
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COMPUTER PROGRAM; ALAG 1 

LANGUAGE : FORTRAN 

TECHNICAL REFERENCES: (F8) , (P4) 
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1. PURPOSE: 


2. USE: 


ALAG 1 

To minimize a function F (x) = f (X, > . . . , X ) 

“■ 1 n 

subject to both equality and inequality constraints. 
Derivatives of all functions must be supplied in a 
user subroutine entitled ALAGB (see item 5). An 
initial estimate of the solution (not necessarily 
feasible) must be specified. This computer program 
is developed from algorithm of section 

CALL ALAGl (N ,M,K,X,EPS , AKMIN, DFN, MAXFN, IPRl, 

IPR2, IW, MODE) 

N An INTEGER set to the number of variables 

n (N ^ 2) . , 

M An INTEGER set to the total number of 

constraints m (M ^ 1) . 

K An INTEGER set to the total number of 

equality constraints k. 

X A REAL array of N elements in which the initial 

estimate of the solution must be set. ALAGl 
returns the solution x in X, 

EPS A REAL array of N elements, in which the 

tolerances for the unconstrained minimizations 
must be set. EPS (I) should be set so that 
EPS (I)/X (I) = AKMIN, roughly speaking. 
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AKMIN 


DFN 


MAXFN 


IPRl 


IPR2 


IW 


MODE 


A REAL number in which the relative error 

tolerance required in the constraint residuals 

must be set. ALAGl will exit when max{ I c . (x) I / 

' 1 

scaling factor for c^} ^ AKMIN for the active 
constraints {i}. 

A REAL number in which the likely reaction in 
F(x) must be set. This is done in the same 
way as for QNWTA - see the QNWTA description. 

An INTEGER in which the maximum number of calls 
of ALAGB on any one unconstrained minimization 
must be set. 

An INTEGER controlling the frequency of printing 
from ALAGl. Printing occurs every IPRl iterations, 
except for details of Increases to the c^ which 
are always printed. No printing at all occurs 
(except for error diagnostics) if IPRl = 0. 

An INTEGER controlling the frequency of printing 
from QNWTA. IPR2 should be set as described in 
the QNWTA documentation. 

An INTEGER giving the amount of storage available 
in COMMON/ ALAGL /W (.) . Set to 2500 unless wishing 
to change the restrictions (see Section 5) . 

An INTEGER controlling the mode of operation of 
ALAGl. If any positive definite estimate is 
available of the hessian matrix of the penalty 
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function, set |mODE| = 2 or 3, otherwise set 
|mODE| = 1 (see QNVJTA description). If 
estimates of the o^ and 0^ parameters are 
available (see item 8) set MODE < 0, otherwise 
set MODE >0. A normal setting for a one-off 
job with no information available is MODE = 1. 

3 . . LABELED COMMON AREAS : 

Certain labeled COMMON areas must be declared and set on entry to ALAGl. 

COMMON/ALGAGE/C (150) Set scale factors (>0) for the constraints in 

C(l), C(2) , . . , . ,C(M) . Choose the magnitude of 
these scale factors to give an indication of 
the constraints evaluated about the initial 
approximation x. If any constraints are violated 
by an amount greater in modulus than that which 
is set, then the setting is increased accordingly. 
These scale factors are transferred to C(M+1), 

C (M+2 ) , C (2M) by ALAGl . 

COMMON/ALAGF/GC (25 , 50) Set the derivatives of any linear constraints on 

entry rather than in ALAGB. This is the most 
efficient and the numbers are not disturbed. 

The manner of setting is described in item 4. 

COMMON/ALAGG/T (150) If MODE < 0 is used, then set the parameters 

..., in T(l), T(2),. . . ,T(M) and the 
parameters . . . ,o^ in T(M+1) ,T(M+2) , . . . ,T(2M) . 

The meaning of these parameters may be found in 
section of this report. 
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C0MM0N/ALAGI/G2P (325) If |mODE| = 2 or 3 set the estimated hessian 

matrix of the penalty function in G2P(1),. 

, G2P (N* (N+1) /2) . The manner of setting is that 

described in QNWTA under the heading MODE. 


Local storage for ALAGl is through labeled COMMON areas. These 
have been set on the assumption that N ^ 25 and M -50. If it is desired 
to remove either or both of these restrictions, then it is necessary to 
increase the storage available in some or all of these areas. This can 
be done by defining the named COMMON areas in the users MAIN with the 
increased storage settings, in which case the extra storage will be 
effective throughout the whole program. The complete list of labeled 
.COMMON used by ALAGl and the corresponding values of N and M are as follows. 

COMMON/ALAGC/F,M,K,IS,MK,NU independent of N and M 


D/G(50) 

E/C(150) 

F/GC(25,50) 

G/T(150) 

H/GP(50) 

I(G2P(325) 

J/V(50) 

K/WK150) . 

L/W(2500) 

M/ZZ(100) 


2N 

3M 

N,M 

3M 

M (p = max(M,N)) 
N- (N+D/2 
y 

3p 

2 

y 

2y 


N/LT(100) 


2M 
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4. ACCURACY: This iterative algorithm terminates normally when 

the following convergence condition is met: 

. max { I c^ (x) I /scaling factor for c^} ^ AKMIN for 

i an element of the set of active constraint indices. 

A diagnostic message for abnormal termination is 

printed when the program is unable to achieve the 

requested accuracy. This may be due to (i) a 

mistake in programming ALAGB, (ii) there is no 

feasible point (in which case a . ->■ « and c. constant 

r 1 

0), (iii) EPS has been set too large relative to 
AKMIN, (iv) the problem is too ill-conditioned. 

OTHER ROUTINES: ALAGl requires the use of ALAGB, ALAGZ, BQDMA, 

MULDA, IIULDB, MULDE, and QNWTA 

ALAGB: USER SUBROUTINE The user must define a subroutine headed by 

SUBROUTINE ALAGB (N,M,X) 

REAL X(l) 

COMMON /ALAGC/F 
COMMON/ALAGD/G (50) 

COMMON/ALAGE/C(150) 

COMMON/ALAGF/GC (25 , 50) 


This 

subroutine 

takes the vector X and sets 

(1) 

F(x) in F; 

(2) Cj^(x),..., c^(x) in C(l) , . . . ,C(M) ; 

(3) 


,3F/gx )|- in G(1),...,G(N); 


1 

n 

(4) 


.,9Ci/gx )|- in GC(1,I),... 
n 


GC (N,I) for I = 1, ; . . ,M. 
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ALAGZ: This subroutine evaluates the augmented function comprised 

of the original objective function and penalty terms that is to be 
optimized. 

SUBROUTINE ALAGZ (N, X, PHI, GPHI) 

N and X as previously defined. 

PHI is the value of the augmented function evaluated at X. 

GPHI is the gradient of the augmented function evaluated at X. 

BQDMA: The purpose of BQDMA is to find the values that minimize a 

quadratic of n variables subject to upper and lower bounds on some or 
all of the variables. 

The quadratic is defined by 

Q(X) = 1/2 X*^ AX - b’^X 
Subject to: 

BL. < X. < BU. i = 1, . . . ,N. 

1 — X — 1 

SUBROUTINE BQDMA (N , A, lA, B, BL,BU ,X,Q ,LT ,K,G) 

N an INTEGER which must be. set by the user to the number 

of variables. 

A a REAL, two dimensional array, each dimension at least 

N; the elements in the upper triangle A(I,J) must 

be set by the user to the corresponding A^^ in (1) , and 
will remain untouched by the subroutine. Elements 
A(I,J) N^I>J are used as working space. 
lA an INTEGER giving the first dimension of A in the 

statement which assigns space to A. 

B a REAL array of at least N elements. The user must set 

B(I). B is not overwritten by BQDMA. 
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BL 


BU 


X 


Q 


LT 


K 


G 


a REAL array of at least N elements. The user must 

tVi 

set BL(I) to the lower bound on the I variable. 

If the bound is non-existent, set it to a very small 

number like -1E75. BL is not overwritten by BQDMA. 

a REAL array of at least N elements. The user must 

til 

set BU(I) to the upper bound on the I variable. 

If the bound is non-existent, set it to a very large 
number. BU is not overwritten by BQDMA. 
a REAL array of at least N elements. BQDMA returns 
the solution in X(I). 

a REAL variable in which BQDMA returns the solution 
value of the quadratic. 

an INTEGER array of at least N elements, set by 
BQDMA to a permutation of the integers 1,2,...,N 
(see K and G below) 

an INTEGER set by BQDMA to the number of free 
variables at the solution (those not on their bounds) . 
These are the variables LT(1), LT(2) , . . . ,LT(K) . 
a REAL array of at least 3*N elements. G (1) , . . . . ,G (N) 
are set by BQDMA to the gradient evaluated at the 
solution point. G is indirectly addressed so that 
G(I) contains the gradient with respect to the LT(I) 
variable, whence G(l) , . . . . ,G(K) will be found to be 
zero. G(N+1) , . . . ,G(3*N) are used by BQDMA as working 
space. 
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MULDA is a subroutine for use in problems which involve the 

T 

addition or subtraction of rank one matrices a ££ to positive 

definite or semi-definite symmetric matrices A stored in factored 
T 

form A = LDL , such that the resulting N x N matrix 

T 

A = A + a 7^ 

is also known to be positive definite or semi-definite. Note that L is 
lower triangular with ^nd D is diagonal with d^ >_ 0. 

SUBROUTINE MULDA (A, N, Z, SIG, W, IR, MK, EPS) 


A 


A REAL one dimensional array of N*(N+l)/2 elements 

T 

in which the matrix A=LDL must be given in factored 
form. The order in which elements of L and D are 


stored is ‘^i»^21’^31’ ' ’ ’ ’^Nl’^^Z 



^Kt The factors of the matrix 

N-1 N,N-1’ N , 

— 

A = A + 0 zz will overwrite those of A on exit. 


N An INTEGER (N^l) which must be set to the dimension 

of the problem. 

Z A REAL one dimensional array of N elements in which 

the vector z must be set. The array Z is overwritten 
by the routine. 

SIG A REAL variable in which the scalar a must be set. 

SIG is not restricted to +. , but if SIG<0 then it 
must be known from other considerations that A is 
positive definite or semi-definite, apart from the 
effects of round-off error. 
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W 


IR 


MK 


EPS 


A REAL array of N elements. If SIG>0 then W is 
not used, and the name of any pne dimensional array 
can be inserted in the calling sequence. If 
SIG<0 then W is used as, work space. In addition for 
SIG<0 it may be possible to save time by setting in 
W the vector v defined by Lv=z . The ways in which 
this can occur are described under MK below. 

An INTEGER to be .set so that 1 ir| is the rank of A. 

If the rank of A is expected to be different from that 
of A, set IR^O. On exit from MULDA, IR(2.0) will 
contain the rank of A. 

An INTEGER to be set only when SIG<0, as follows. 

If the vector v defined by Lv=z has not been calculated 

previously, set MK=0. If MULDA has been used previously 
- 1 - — 

to calculate A z, then v is a by-product of this 
calculation and is stored in the W parameter of MULDE. 

In this case transfer v to the W parameter of MULDA 
and set MK=1. If z has been calculated as z = Au for 
some arbitrary vector u using MULDD, then again v is 
a by-product of the calculation and is available in the 
W parameter of MULDD. In this case (or any other in 
which V is known) set v in the W parameter of MULDA 
and set MK=2. 

A REAL variable to be set only when SIG<0 and A is 
expected to have the same rank as A. In certain ill- 
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conditioned cases a non-zero diagonal element 
of D might become so small as to be indeterminate. 

, Two courses of action are possible. One is to 

introduce a small perturbation in order that A 
keeps the same rank as A. This is the normal course 
of action and is achieved by setting EPS equal to 
the relative machine precision e. The other course 
of action is to let the rank of A be one less than 
the, rank of A. This is achieved by setting EPS 
equal zero. 

MULDB - factorizes a positive definite symmetric matrix given in 
A. This matrix is then used in MULDA. 

SUBROUTINE MULDB (A, N, IR) 

A Must contain the elements of A in the order 


^ll’^21’ ■ ■ ■ ’^Nl’^22’^32’ ■ ■ *^N2 ®N-1,N-1’^N,N-1’^NN; 

that is as successive columns of its lower triangle) . 

On exit A will be overwritten by the factors L and D 


in the form described in MULDA. 


N Order of the matrix A, 


IR An INTEGER set by MULDB to the rank of the factori- 

zation. If the factorization has been performed 
successfully IR=N will be set. If IR<N then the 
factorization has failed because A is not positive 
definite (possibly due to round-off error) . In this 
case the factors of a positive semi-definite matrix 
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of rank IR will be found in A. However the 
results of this calculation are unpredictable, 

. and MULDB should not be used in an attempt to 

factorize positive semi-definite matrices. 

MULDE calculates the vector z* = A ^ z where A is in factored form 

SUBROUTINE MULDE (A, N, Z, W, IR) 

A Must be set in factored form. 

N Order of the matrix A. 

Z A REAL array of N elements to be set to the vector z. 

On exit Z contains the vector z* = A ^ z. 

W A REAL array of N elements which is set by MULDE 

to be vector v defined by Lv=z. If this vector is 
not of interest, replace W by Z in the calling 
sequence to obviate the need to supply extra storage. 

IR An INTEGER which must be set to the rank of A. 

QNWTA finds the minimum of a function F(x) of several variables 
given that the gradient vector can be calculated. This routine is based 
upon a quasi-Newton method described by Fletcher in (F8 ) . 

SUBROUTINE QNWTA (FUNCT, N, X, F, G, H, W, DFN, EPS, MODE, MAXFN, 

IPRINT, lEXIT). 

FUNCT An IDENTIFIER of the users subroutine. 

N An INTEGER to be set to the number of variables (N 2) . 

X A REAL ARRAY of N elements in which the current estimate 

of the solution is stored. An initial approximation 
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must be set in X on entry to QNWTA and the best 
estimate obtained will be returned on exit. 

A REAL number in which the best value of F(x) 
corresponding to X above will be returned. 

A REAL ARRAY of N elements in which the gradient 
vector corresponding to X above will be returned. 

Not to be set on entry. 

A REAL ARRAY of N*(N+l)/2 elements in which an 
estimate of the hessian matrix is stored. The 
matrix is represented in the product form LDL 
where L is a lower triangular matrix with unit 
diagonals and D is a diagonal matrix. The lower 
triangle of L is stored by columns in H excepting 
that the unit diagonal elements are replaced by 
the corresponding elements of D. The setting of 
H on entry is controlled by the parameter MODE. 

A REAL ARRAY of 3*N elements used as working space. 

A REAL number which must be set so as to give QNWTA 
an estimate of the likely reduction to be obtained in 
F (x) . DFN is used only on the first iteration so 
an order of magnitude estimate will suffice. The 
information can be provided in different ways 
depending upon the sign of DFN which should be set 
in one of the following ways: 
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EPS 


MODE 


DFN>0 the setting of DFN itself will be 

taken as the likely reduction to be 
obtained in F(x) . 

DFN=0 it will be assumed that an estimate of 
the minimum value of F(x) has been set 
in argument F, and the likely reduction 
in F(x) will be computed according to 
the initial function value. 

DFN<0 a multiple |dFn| of the modulus of the 
initial function value will be taken as 
an estimate of the likely reduction. 

A REAL ARRAY of N elements to be set on entry to 
the accuracy required in each element of X. 

An INTEGER which controls the setting of the initial 
estimate of the hessian matrix in the parameter H. 

The following settings of MODE are permitted. 

M0DE=1 An estimate corresponding to a unit 
matrix is set in H by QNWTA. 

M0DE=2 QNWTA assumes that the hessian matrix 

itself has been set in H by columns of 

its lower triangle, and the conversion 
T 

to LDL form is carried out by QNWTA. 

The hessian matrix must be positive definite 

M0DE=3 QNWTA assumes that the hessian matrix has 


been set in H in product form. This is. 
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MAXFN 


IPRINT 


lEXIT 


convenient when using the H matrix 
from one problem as an initial 
estimate for another, in which case 
the contents of H are passed on 
unchanged . 

An INTEGER set to the maximum number of calls of 
FUNCT permitted. 

An INTEGER controlling printing. Printing occurs 
every [ IPRINT | iterations and also on exit, in the form 
Iteration No, No of calls of FUNCT, lEXIT (on 
exit only) . 

Function value 

X(l) ,X(2) , . . . ,X(N) 8 to a line 

. G(1),G(2),...,G(N) 8 to a line 

The values of. X and G can be suppressed on inter- 
mediate iterations by setting IPRINT<0. All 
intermediate printing can be suppressed by setting 
IPRINT=MAXFN+1 . All printing can be suppressed by 
setting IPR1NT=0. 

An INTEGER giving the reason for exit from QN^'/TA. 

This will be set by QNWTA as follows: 

IEXIT=0 (M0DE=2 only) . The estimate of the 

hessian matrix is not positive definite. 
IEXIT=1 The normal exit in which |dX( 1) | <EPS(I) 
for all 1=1, 2,... N, where DX(I) is the 


change in X on an iteration. 
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IEXIT=2 


T 

G DX^O. Not possible without rounding 
error. Probable cause is that EPS is 
set too small for computer word length. 


IEXIT=3 FUNCT called MAXFN times. 
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COMPUTER PROGRAM: 


LANGUAGE : FORTRAN 


ALAG 2 


TECHNICAL REFERENCES: (F8) , (P5) 
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The ALAG2 program differs from ALAGl only in the type of uncon- 
strained optimizer routine employed. Therefore, this section will only 
document this, routine and the user is referred to the documentation 
on ALAGl (except for the ALAGl routine, QNWTA) as being applicable to 
ALAG2. The unconstrained optimizer routine for ALAG2 is VAMMA. The. 
purpose of VAMM is to calculate the minimum value of a multivariate 
function. This routine uses the BFGS variable metric method without 
line searches of the type analyzed by Powell (P5) . 

SUBROUTINE VAMMA (FUNG, N, X, F, G, SCALE, ACC, W, MAXFN)' 

FUNG The name of the subroutine provided by the 

user. It must be declared in an EXTERNAL 
statement. 

N An integer whose value must be set to the 

number of variables. 

X An array of at least n elements, set by the 

user to initial values of the variables 

(x. ,x„,...,x ). Usually computing time is 
i z n 

saved if these estimates are close to the 
final solution. They are changed automatically 
to the values that give the least calculated 
value of the objective function. 

F A real variable that is set automatically to the 

least calculated value of the objective function. 

G An array of at least n elements that are set 

automatically to the components of the first 
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SCALE 


ACC 


W 


MAXFN 


derivative vector of F for the final values 
of the variables. Small values indicate a 
successful calculation. 

th 

An array of at least n elements, whose i 

component (l^i£n) must be set to a positive value 

that is a suitable change to make to initially 

in the minimization calculation. About 10% of 

the total expected change in is often a good 

value. This array is called SCALE because its 

elements should reflect the relative sizes of 

(x. , x_ . . . , X ) . 

X z n 

A real number that defines the required accuracy. 

The calculation finishes when, for i=l,2,. ..,n. 
changes in x^ of size ACC*SCALE(i) do not reduce 
the objective function. When in doubt about the 
value of ACC it is usually best to choose a small 
value. 

An array of at least -^(n+13) elements that is used 
as working space. On exit from the subroutine the 
first ^n(n+l) locations of W give the final approxi- 
mation of the second derivative matrix, stored in 
the factored form used by subroutine MULDA. 

An INTEGER set to the maximum number of calls of 


FUNC permitted. 
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BLOCK COMMON for VAMMA 

COMMON /VAMMA/ IPRINT , LP , MAXFUN , MODE , NFUN 
The five integers called IP RINT, LP , MAXFUN, MODE and NFUN are present 
in a common block in order that they can be reached by the user. 

In most calculations they can be ignored, but sometimes they are 
useful, their purpose being as follows. 

IPRINT This has a default value of zero, and is 

unchanged by VAMMA. If IPRINT=0, then no 
printing occurs except perhaps the diagnostic 
message mentioned below. Otherwise the value 
of the objective function is printed every 
I IPRINT I iterations. If IPRINT>0 the values 
of X(.) and G(.) are printed also. If IPRINT5^0 
the final values of F,X(.) and G(.) are always 
printed. 

LP This has a default value of 6, and is the stream 

number for any output from VAMMA. 

MAXFUN This has a default value of zero, in which case 

it does not influence the calculation. However, 
if it is positive, then VAMMA finishes automatically 
when the user subroutine is called MAXFUN times. 
Normal Convergence can occur earlier. 

MODE This has a default value of one, in which case the 

initial approximation to the second derivative 
matrix is set automatically to a positive diagonal 
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NFUN 


matrix. However, if a suitable positive 

definite approximation is known, then it may 

be passed to VAMMA in the first -^(n+1) 

locations of W by setting M0DE=2 or M0DE=3. 

When M0DE=2 these elements of W must contain 

the lower triangle of the Hessian approximation, 

B say, in the order B, , ,B„^ ,B.^ , . . . ,B ,, 

X-L 31 ni 

B„„,B ,...,B ,...,B ,B ,B . When 

22 32 n2 n-1 n-1 n n-1 n n 

M0DE=3 the Hessian approximation must be given 
in the factored form used by subroutine MULDA, 
which is also the form used to provide the 
Hessian approximation in W at the return from 
VAMMA. A check for positive definiteness is made 
automatically by VAMMA, and if it fails a diagnostic 
message is printed. In this case the calculation 
proceeds as though MODE-1, but the actual value of 
MODE is not changed. 

This integer is set by VAMMA to the number of times 


it calls the user subroutine. 
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COMPUTER PROGRAM: ALAG3 


. LANGUAGE: FORTRAN 


TECHNICAL REFERENCES: (F8) 
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The ALAG3 program differs from ALAGl and ALAG2 in the type of 
unconstrained optimizer routine employed. Therefore, this section 
will only dociument this routine and the user is referred to the 
documentation on ALAGl as being applicable to ALAG3. The unconstrained 
optimizer routine employed within ALAGl is referred to as FDQNW. The 
purpose of FDQNW is to calculate the minimum value of a multivariate 
function. The method used is the quasi-Newton method of ALAGl in which 
derivatives are estimated by finite difference techniques. 


SUBROUTINE FDQNW (FUNCT, N, X, F, G, H, W, DFN, XM, HH, EPS, MODE, 
MAXFN, IPRINT, lEXIT) 

FUNCT The name of the subroutine provided by the 

user. It must be declared in an EXTERNAL 
statement. 

N An INTEGER to be set to the number of variables 

(N > 2). 


X . A REAL ARRAY of N elements in which the current 

estimate of the solution is stored. An initial 
approximation must be set in X on entry to FNQNW 
and the best estimate obtained will be returned 
on exit. 

F A REAL number in which the best value of F(x) 

corresponding to X above will be returned. 

G A REAL ARRAY of N elements which is used to store 

an estimate of the gradient vector VF(x) . Not 


to be set on entry. 
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H 


W 

DFN 


A REAL ARRAY of N*(N+l)/2 elements in which an 

2 

estimate of the hessian matrix 9 F/(9x^9x^) is 

stored. The matrix is represented in the product 
T 

form LDL where L is a lower triangular matrix 
with unit diagonals and D is a diagonal matrix. 

The lower triangle of L is stored by columns in 
H excepting that the unit diagonal elements are 
replaced by the corresponding elements of D. The 
setting of H on entry is controlled by the 
parameter MODE. 

A REAL ARRAY of 3*N elements used as working space. 

A REAL number which must be set so as to give FDQNW 
an estimate of the likely reduction to be obtained 
in F (x) . DFN is used only on the first iteration so 
an order of magnitude estimate will suffice. The 
information can be provided in different ways 
depending upon the sign of DFN which should be set 
in one of the following ways: 

DFN>0 the setting of DFN itself will be 

taken as the likely reduction to be 
obtained in F (x) . 

DFN=0 it. will be assumed that an estimate of 
the minimum value of F (x) has been set 
in argument F, and the likely reduction 
in F (x) will be computed according to 


the initial function value. 



67 


XM 


HH 


EPS 


MODE 


DFN<0 a multiple |dFN| of the modulus of 

the initial function value will be taken as 

an estimate of the likely reduction. 

A REAL ARRAY of N elements to be set on entry so 

that XM(I) > 0 contains an indication of the 

magnitude of X(I). This quantity need not be set 

precisely as it is merely used in scaling the problem. 

A REAL number to be set so that HH*XM(I) contains 

a step length to be used in calculating G(I) by 

-t/2 

differences. Set HH equal to 2 where t is the 
number of significant binary digits in the calculation 
of F. 

A REAL number to be set on entry so that the accuracy 

required in X(I) is EPS*XM(I) for all I, (EPS > 0). 

An INTEGER which controls the setting of the initial 

estimate of the hessian matrix in the parameter H. 

The following settings of MODE are permitted. 

M0DE=1 An estimate corresponding to a unit 

matrix is set in H by FDQNW. 

M0DE=2 FDQNW assumes that- the hessian matrix 

itself has been set in H by columns 

of its lower triangle, and the conversion 
T 

to LDL form is carried out by FDQNW. 

The hessian matrix must be positive definite. 
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MAXFN 


IPRINT 


lEXIT 


M0DE=3 FDQNW assumes that the hessian matrix 
has been set in H in product form. This is 
convenient when using the H matrix from one 
problem as an initial estimate for another, 
in which case the contents of H are passed on 
unchanged. 

An INTEGER set to. the maximum number of calls of 
FUNCT permitted. Up to 2N more calls may be taken 
if the limit is exceeded whilst evaluating a gradient 
vector by differences. 

An INTEGER controlling printing. Printing occurs 
every | IPRINT | iterations and also on exit, in the 
form 

Iteration No., No of calls of FUNCT, lEXIT (on 
exit only) . 

Function value 

A(1),X(2),...,X(N) 8 to a line 

G(l) ,G(2) , . . . ,G(N) 8 to a line 

The values of X and G can be suppressed on inter- 
mediate iterations by setting IPRINT<0. All 
intermediate printing can be suppressed by setting 
IPR1NT=MAXFN+1 . All printing can be suppressed by 
setting IPRINT=0. 

An INTEGER giving the reason for exit from FDQNW. 

This will be set by FDQNW as follows: 
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IEXIT=0 


IEXIT=1 


IEXIT=2 


IEXIT=3 


(M0DE=2 only). The estimate of 
the hessian matrix is not positive 
definite. 

The normal exit in which |dX(I) | <EPS(I) 

for all 1=1, 2,..., N, where DX(I) is 

the change in X on an iteration. 

T 

G DX>0. Either due to rounding errors 
because EPS is set too small for the 
computer word length, or to the 
truncation error in the finite difference 
formula for G being dominant. . 

FUNCT called MAXFN times. 
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